Copy File Bin Tanaka T21
T 21D.rom, T2XD.bin and T2XD.rfs. In this blog provides software tools, files (bin & rfs), and firmware that can be downloaded it for free to process updating firmware on your Yealink ip phone. How to use recovery mode update firmware. Upgarde bisa dilakukan melalui usb flashdisk. Tanaka T 21 New Samurai misalnya ketika ingin menikmati fitur tandberg maka harus diupdate terlebih dulu jika sebelumnya hanya bisa power vu.
Associated Data
All simulation and analysis code used in this study are available in Additional file 2. Data were downloaded using links provided by the original publications or acquired from the authors directly.
Abstract
Background
Aneuploidies are copy number variants that affect entire chromosomes. They are seen commonly in cancer, embryonic stem cells, human embryos, and in various trisomic diseases. Aneuploidies frequently affect only a subset of cells in a sample; this is known as “mosaic” aneuploidy. A cell that harbours an aneuploidy exhibits disrupted gene expression patterns which can alter its behaviour. However, detection of aneuploidies using conventional single-cell DNA-sequencing protocols is slow and expensive.
Methods
We have developed a method that uses chromosome-wide expression imbalances to identify aneuploidies from single-cell RNA-seq data. The method provides quantitative aneuploidy calls, and is integrated into an R software package available on GitHub and as an Additional file of this manuscript.
Results
We validate our approach using data with known copy number, identifying the vast majority of aneuploidies with a low rate of false discovery. We show further support for the method’s efficacy by exploiting allele-specific gene expression levels, and differential expression analyses.
Conclusions
The method is quick and easy to apply, straightforward to interpret, and represents a substantial cost saving compared to single-cell genome sequencing techniques. However, the method is less well suited to data where gene expression is highly variable. The results obtained from the method can be used to investigate the consequences of aneuploidy itself, or to exclude aneuploidy-affected expression values from conventional scRNA-seq data analysis.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-017-4253-x) contains supplementary material, which is available to authorized users.
Background
Aneuploidies are gains or losses of entire chromosomes. They occur commonly during early human development [1], cause some human disease (Edwards, Patau and Down syndromes), and are implicated in critical failures at the pre-implantation stage of development [2]. While the expression levels of genes on chromosomes with an aneuploidy are buffered in some cases, these mechanisms rarely fully compensate for the additional or missing gene copy and may only act on a gene-by-gene basis [3]. Aneuploidy-driven expression changes have been observed in yeast [4], Drosophila [5], and human cell lines at both the mRNA and the protein level [6].
Previous studies using array Comparative Genomic Hybridisation (aCGH) [1] have shown that aneuploidies that arise during very early embryonic development are frequently mosaic in character, such that the copy number gain or loss only affects a fraction of the cells in the embryo; these are referred to as mosaic aneuploidies. Similarly, mosaic aneuploidy is observed in populations of embryonic stem cells [7]. To properly characterise and investigate mosaic aneuploidy, it is therefore necessary to study individual cells rather than bulk populations.
Technological developments have facilitated the application of genomics techniques at single-cell resolution. This has allowed the genome, transcriptome, epigenome and proteome of individual cells to be molecularly characterised [8–12]. Furthermore, it has recently become possible to combine multiple sequencing modalities and apply them to the same cell: for example, parallel genome and transcriptome sequencing [13] (G&T-seq) has allowed the integration of copy number and mRNA expression information.
Nevertheless, such combined profiling is relatively unusual and most experiments focus on assaying only a single molecular feature. In particular, most published studies have focussed on single-cell RNA-sequencing (scRNA-seq) [8, 14–16]. Consequently, and given the relatively high prevalence of aneuploidies noted above, the ability to call such features directly from scRNA-seq data is highly desirable.
To this end, we have developed a method for calling aneuploidies from scRNA-sequencing data and applied it to a variety of different use cases. Our approach works by statistically identifying, separately for each cell, chromosomes with genes that show consistently deviant expression compared to the same chromosome in other cells. The efficacy of a similar approach has previously been demonstrated using tumour samples [17], where different clonal populations of cells could be visually distinguished. However, that method does not make explicit ploidy calls. Our method shows high levels of sensitivity and specificity when using known copy-number information from G&T-seq data. Moreover, its predictions are supported by allele-specific expression information and differential expression analysis.
Methods
Let cgij denote the normalised (Counts Per Million, CPM) expression level for gene g on chromosome i in cell j. Furthermore, let
denote the expression of gene g on chromosome i in cell j normalized by the median expression of the same gene across cells. We consider only highly expressed genes (see “Operational information” section, below) to reduce the effects of technical artefacts common to scRNA-seq as well as to prevent occurrence of extreme values of agij.
Subsequently, for every cell-chromosome combination we define
bij depends on the number of genes g considered on chromosome i. To make this sum comparable across chromosomes, which contain different numbers of genes, we normalise by the number of considered genes on each chromosome, Gi:
Finally, within each cell, we convert this ratio into a score centred at 1 across chromosomes:
Assuming that no chromosome within a cell has a copy number gain or loss, sij will deviate randomly around 1. By contrast, if specific chromosomes possess evidence of an aneuploidy their scores will be elevated or reduced accordingly. A graphical representation is shown in Fig. 1a.a. Note that this interpretation assumes that the majority of chromosomes within a cell are not affected by the same type of aneuploidy.
Hanging File Bin
Successful detection of aneuploidies from scRNA-seq data. a Overview of the method. Cells with aneuploid chromosomes (purple and green) have altered levels of transcription of genes on the affected chromosome (less and more, respectively). For a given chromosome and cell, we compute a score for how deviant the overall expression of genes on that chromosome is relative to that in other cells. b We applied our method to 8-cell stage mouse embryos that were sequenced via a parallel genome and transcriptome method (G&T-seq). Our method performs well compared to the ground truth provided by genomic sequencing (sensitivity 78.0%, specificity 99.5%, FDR 11.4%). The chromosome with high Z-score in embryo F is not called as aneuploid as it does not pass an effect size threshold (“Methods” section)
To infer whether a cell-chromosome displays aberrant copy number, we converted sij into a Z-score, where the variance was estimated separately for each chromosome across cells using the median absolute deviation (MAD). We identified aneuploid chromosomes using an FDR-corrected p<0.1, where the p-value was obtained using Student’s t-distribution.
Operational information
As input, we considered all cells that passed quality control using the criteria employed in each analysed dataset.
We only consider highly expressed genes (median CPM>50) to prevent inclusion of genes where small differences in expression across cells could make large differences to agij.
DeviantArt is the world's largest online social community for artists and art enthusiasts, allowing people to connect through the creation and sharing of art. Feb 1, 2018 - the vore game by Naughty hungry lion. If you log in before playing, you'll be able to save your progress - which means you. Similar Games. Voreville's Vore Games Picks. Greetings fellow vore fans! Looking for some cool vore games to play with? Here's just a few games we found off the internet that.
Before utilising the method, Principal Component Analysis (PCA) was applied to the log-transformed highly expressed genes to identify substructure in the data. The presence of sub-structure will be driven by differentially expressed genes across cells. Consequently, jointly analysing cells in this setting would result in chromosome scores driven by differentially expressed genes rather than aneuploidy. Therefore, if cell groupings were observed, we assigned cells into different groups and analysed them separately.
Finally, to call an aneuploidy we not only required that the corrected p-value was less than 0.1, but also imposed an effect size threshold such that cell-chromosomes where 0.8<sij<1.2 were not considered significant. This is analogous to approaches commonly applied in microarray and RNA-sequencing analyses when detecting differentially expressed genes.
Allele-specific expression analysis
To identify biases in allele-specific expression that may be indicative of aneuploidy we considered, for each cell-chromosome, the total number of reads that could be uniquely allocated to one allele. Subsequently, we computed an allele-ratio as the ratio of the total number of reads from one allele over the total number of reads that could be uniquely assigned to either allele. To ensure the median ratio was the same across all chromosomes, we median centred the computed ratios on a per chromosome basis.
At different stages of embryonic development, progressive activation of the paternal genome results in systematically different allele ratios between cells across stages. To ensure that allele ratios were comparable between cells of different stages, we additionally median centred the allele ratios for all chromosomes in each embryo. This step also corrects for further embryo-specific allele-ratio biases.
Differential expression analysis
To find genes that were associated with aneuploidy, we performed differential expression (DE) analysis using two scRNA-seq datasets (mouse embryos [18] and mESCs [19]). First, we subsetted the data such that it contained only genes that are expressed at a mean level above 10 counts per million (CPM) or more in both datasets. For each dataset, we subsequently performed differential expression analysis using edgeR [20] between cells called as diploid and those that contain at least one chromosome our method called as aneuploid. We added batch (for the mESCs) and embryo number (for the embryos) as covariates to account for technical effects. We called differentially expressed genes as those with an FDR-corrected p<0.1.
We considered genes that were either downregulated in both datasets or upregulated in both datasets as a high-confidence set of genes that have altered expression levels in aneuploid cells.
Results
To detect aneuploidies from scRNA-sequencing data we computed, for a defined group of cells, a score for each chromosome-cell pair that measures whether the expression level of genes in that chromosome differs substantially from other cells. We converted this score into a p-value and used this to detect significant deviations, which we interpret as providing evidence for the presence of an aneuploidy (“Methods” section). The algorithm is available in an R package (scploid) provided in Additional file 1, with the latest version also available at https://github.com/MarioniLab/Aneuploidy2017.
True aneuploidies are detected at a low false positive rate
To assess our method’s performance, we first applied it to a dataset where DNA copy number was known. This dataset processed cells from 8-cell stage mouse embryos using a combined genome-and-transcriptome (G&T-seq) strategy [13]. Here, the mRNA and DNA from a single cell are physically separated from one another and processed in parallel to provide transcriptomic and genomic information about the same cell. We therefore used the genomic copy number calls to assess our method’s performance on the transcriptome data.
Grouping cells by treatment (reversine or control), our method identified aneuploidies with a sensitivity of 78.0% (from 50 real aneuploidies) and FDR of 11.4% (Fig. 1b).b). Importantly, for chromosomes with no evidence of copy number changes, the p-values derived from the Z-scores were uniformly distributed, suggesting the assumptions made by our model are reasonable (see Additional file 2: Figures S3–S4). Of the 11 false negative aneuploidy calls, 8 were found in the aneuploidy rich embryo E, which showed considerably increased incidence of aneuploidy compared to the other embryos. These chromosomes frequently showed low levels of total expression deviation compared to normal ploidy chromosomes (Additional file 2: Figure S10). By contrast, the 5 false positives were spread across embryos and chromosomes. We did not call any aneuploidies in the control (non-treated) cells, concordant with DNA sequencing copy number calls.
High gene expression variance confounds our model
In addition, G&T-sequencing was also applied to cells from an immortalized lymphoblastoid cell line, HCC38-BL [13]. Although derived from normal (primarily diploid) cells, G&T-sequencing identified seven copy number changes (three of which affected only one chromosomal arm). Our method identified only two of these changes, albeit the remaining aneuploid chromosomes had scores that were consistent with the change in copy number but not statistically significant. Our method also made several false positive calls. Performance metrics are shown in Fig. 2aa.
High variability in gene expression levels compromises performance. a Our method performs less well on cell-line G&T-sequencing data than on the mouse embryos. All cells were considered for the 8-cell embryos and HCC38-BL data. Trisomy 21 cells were downsampled to a ratio of 1 T21 cell : 4 control cells (normal ploidy chromosome 21), to ensure that these aneuploidies were in the minority and therefore can be detected. b The datasets with poor performance show more variable gene expression profiles. For 500 genes selected at random from each dataset (navy: HCC38-BL cell line; yellow: Reversine-treated 8-cell embryos; cyan: trisomy 21 iPS derived neurons) we plot the (log) standard deviation of expression (y−axis) against the (log) mean expression (x−axis). A linear model was fitted separately for each dataset using genes with a median count (per million reads) of at least 50 and overlaid. c Simulated datasets with different dispersion parameters are shown. We simulated four datasets to assess the impact of gene expression variability on the performance of our method. Genes from each simulation are shown and the different dispersion parameters used in the simulation are noted. The regression lines from the fit in 2b are overlaid. d As the data become more variable, the performance of our method degrades. For simulations with variability comparable to the HCC38-BL and Trisomy 21 neuron datasets, the sensitivity and precision are considerably impacted. Reported values are the mean of 10 simulations
To explore what might underpin this observation, we investigated the scRNA-sequencing data derived from these cells in greater detail. Relative to the 8-cell embryo cells, the expression profiles of the HCC38-BL cells were more variable (Fig. 2b).b). Download imo the world of magic gold hack. To investigate how this increased variance might affect the performance of our method, we simulated data based upon the 8-cell stage data introduced earlier. This allowed us to adjust the degree of variability in expression while realistically controlling for both gene expression levels and the relationship between variability and expression (Fig. 2c,c, Additional file 2: Section 3.4).
Simulations where the mean-variance relationship was similar to the real 8-cell embryo data yielded performance approximately equal to that of the real data. However, when we increased the level of variability in the expression profiles to that observed in the HCC38-BL cells, we observed a substantial decrease in the performance of our approach (see Fig. 2d).d). Similar behaviour was observed in the final set of cells profiled using the G&T protocol, a set of Trisomy 21 and normal ploidy neurons derived from induced pluripotent stem cells (Additional file 2: Section 4).
In sum, the relatively poor performance of our approach when applied to both simulated and real data with higher levels of variance in gene expression levels across cells demonstrates that our ability to detect copy number changes is, unsurprisingly, heavily influenced by the underlying variability in the analysed expression profiles (see “Discussion” section).
Using allele specific expression to validate copy number calls
It has been reported that the quality of scRNA-seq data derived using joint protocols is less consistent and sometimes lower in quality than when only the mRNA is profiled [21]. Given the relationship between our method’s performance and the amount of noise in the data (explored using the simulations above), we thus applied it to conventional scRNA-seq data. Importantly, we considered scRNA-seq data generated from F1 intercrosses between two inbred strains of mice [18, 19], where each allele is derived from a distinct genetic background.
The presence of an aneuploidy will create an imbalance between the copy numbers of alleles from the affected chromosome, which should lead to an expression imbalance between the two sets of alleles. While this does not provide as definitive a conclusion as DNA sequencing data, a cell that contains such an expression imbalance is nonetheless more likely to contain an aneuploidy. This approach, which considers only the allele-specific expression counts, is orthogonal to our existing method, which considers the total gene expression levels. Therefore agreement between the two approaches would offer support for the efficacy of our method without the use of G&T-seq.
Specifically, we have considered a set of F1 cells derived from the embryos of C57BL/6J × CAST/EiJ mice (two cell stage to late blastocyst) [18]. First, we applied our aneuploidy detection method on total expression levels to identify candidate aneuploidies. Independently, we considered the allele-specific counts, recording the relative contribution to each chromosome’s total allele-mappable expression from each set of alleles. Chromosomes that score highly in our total expression method show significantly higher levels of deviation from balanced allelic expression than other chromosomes (p<10−9; Fig. 3a),a), thus supporting the validity of our aneuploidy calls. We noted that monosomic-called chromosomes show more severe deviation than those called as trisomic (Fig. 3b),b), as expected from the total absence of an allele set in cases of monosomy compared to a smaller change in allele proportion in trisomies. We observed very similar behaviour in another F1 cross dataset of cultured mouse embryonic stem cells [19] (see Additional file 2: Section 6).
Allele-specific expression datasets support the method’s aneuploidy calls. a Chromosomes with higher-confidence aneuploidy calls show greater allele-specific expression (ASE) deviation than cells that are not called (p<10−9, Mann-Whitney U test). The ASE deviation was corrected for systematic ASE differences for each embryo and for each autosome (Methods). Z-scores to the right of the dashed line are significant after FDR-correction. Values above Z-score bins are the number of chromosomes in each bin. b Trisomic and monosomic chromosomes show greater ASE deviation than chromosomes with normal-ploidy (p<10−8 and p<10−2, respectively; Mann-Whitney U test). Black circles indicate the mean deviation. Violin plots are area normalised. Values above the plots are the number of chromosomes considered
Differential expression analysis identifies genes associated with aneuploidy
To explore the utility of the aneuploidy predictions made by our method, we next performed transcriptome-wide differential expression analysis between all cells called as aneuploid (irrespective of the affected chromosome(s)) and those called as diploid in the two large scRNA-seq datasets analysed above (mouse embryos [18] and cultured mESCs [19]). In total, we identified 22 genes that were commonly upregulated in aneuploid cells in both datasets (“Methods” section & Additional file 2: Section 8; FDR <0.1). No shared downregulated genes were found.
Of these genes, a number show particular relevance for aneuploidy: Gas5 is a noncoding RNA with roles in growth arrest [22] and apoptosis [23]; Txnip overexpression results in G1 cell-cycle arrest [24]; and Rps27l overexpression has been shown to promote p53 activity [25], resulting in cell-cycle arrest and apoptosis. Apoptotic and growth arrest functions are known to be associated with aneuploidy [26].
The activation of the unfolded protein response is also linked with aneuploidy [27]. Three of the differentially expressed genes have roles in this pathway: Calnexin (Canx), Pdia3 [28], and Sdf2 [29]. Sdf2 differential expression is additionally associated with human oocyte aneuploidy [30].
The roles of many of these differentially expressed genes in aneuploidy-related pathways provides further support for the performance of our method.
Discussion
Our analysis demonstrates that changes in DNA copy number at the single-cell level can be inferred directly from single-cell RNA-sequencing data. One significant caveat is the relationship between the method’s performance and the degree of noise in the data; such an increase in noise can be driven by several factors. Heterogeneity between cell populations, if not accounted for by clustering cells into homogeneous groups, can lead to systematic chromosome score differences as a result of genes being differentially expressed between populations. Furthermore, cells that are transcriptionally more variable increase the amount of aneuploidy-independent gene expression variance present, which compromises our method’s performance. Such an increase in variance can be driven by technical and biological effects, the former of which is particularly relevant for scRNA-seq, where different protocols generate different amounts of technical noise [31].
Despite this, we demonstrate the good performance of our method across a variety of cell types and conditions. In practice, we suggest that a user exercise caution over the following characteristics of their data. Subpopulations of cells should be identified and analysed separately to reduce aneuploidy-independent expression variance, akin to new approaches for normalising scRNA-seq data [32]. The method should be applied to cells that are relatively phenotypically normal (for example, large genome rearrangements move cells away from integer copy number values and hinder chromosomal gene assignment).
Additionally, as for all scRNA-sequencing analysis approaches, the degree of noise in the data is vital. We have identified three easily computable metrics to assess this: a gene-wise score that captures increased or decreased expression variability compared to the G&T-seq 8-cell embryos; the total number of genes that qualify as highly expressed (median CPM > 50); and the fraction of zero counts observed in these genes — a lower variability score, more available genes, and fewer zeros are features that enable application of our method. Importantly, when applying these metrics to recently generated single-cell RNA-seq datasets [16, 33] we observed that the method appears to be well suited to analyse contemporary scRNA-seq data (see Additional file 2: Section 7 for further detail of these metrics and datasets).
One further concern is whether cells captured at different stages of the cell cycle will lead to differences in copy number driven by variability in the process by which chromosomes are replicated. However, previous work has suggested that dosage control is tightly regulated during the cell cycle [34], mitigating this effect. Additionally, when using an existing classifier to assign the mESCs and developing mouse embryos analysed above to different cell cycle phases [35], we did not observe an association between the degree of aneuploidy and cell cycle (see Additional file 2: Section 9).
Moving forward, it may be possible to integrate more information to increase predictive power. For example, classifiers have previously been trained to identify aneuploid cells based on specific transcriptional signatures [36]. Alternatively, where possible, allele-specific expression information could be coupled with total expression data, which would be especially helpful when detecting copy number deletion events.
Jul 11, 2018 - Dynamips can run unmodified IOS images. In the new GNS3 1.4, there is a way to run a second category of switches and routers. Cisco 7200 appliance. Cisco 7200 Router. Cisco 7200 124-24.T5. T5.image, 6b89d0d804e1f2bb5b8bda, 102.0 MB, No download link available. Cisco 7200 router ios image free download for gns3. Jul 14, 2014 - All the IOS, IOU, Nexus and other images has been updated, checked and verified with. Feel free to download them and use them for your Cisco certifications studies. 3745 – 7200 (v15). Of this without having dedicated network hardware such as routers and switches.
Having acquired ploidy information about a single-cell RNA-seq dataset through this method, the results can be exploited in a number of ways to obtain further biological insight. For example, gene expression counts on chromosomes that are suspected to be aneuploid may be excluded when selecting highly variable genes [37] to prevent propagation of aneuploidy-driven signal into downstream analyses. Additionally, as shown above, segregating cell-chromosomes by whether or not they harbour an aneuploidy can help identify genes that are potentially associated with copy number aberrations.
The method may provide particular benefits in stem cell and embryonic research, where aneuploidies are known to be common [2, 7, 38] and single-cell approaches have been widely applied. Additionally, the method is straightforward to apply (requiring no additional experimental work) and easy to interpret, yielding direct aneuploidy calls for each cell-chromosome unlike previous strategies [17]. Application of this method also represents a considerable cost saving compared to expensive and experimentally more complex single-cell genome sequencing protocols [39].
Conclusions
In this paper, we have shown that chromosome-wide imbalances in mRNA gene expression measured using scRNA-seq can be used to identify aneuploidy. We have demonstrated this using ground-truth ploidy knowledge (parallel genomic & transcriptomic sequencing), allele-specific expression ratios over chromosomes, and differential expression between cells with and without called aneuploidies. The method is straightforward to apply, albeit care must be taken to control for potential confounding factors. Downstream applications include identification of aneuploidies for gaining insight into their cause and / or consequences, as well as to exclude expression values from aneuploid chromosomes to improve the accuracy of common analysis techniques.
Additional files
Additional file 1(6.2M, gz)Code for analysis. A gzipped tarball containing all code used for analysis, as well as the.html report referred to above. A package to run aneuploidy assessment in R is also included, alongside a script to download the data we have used. The latest version of these files may be found on https://github.com/MarioniLab/Aneuploidy2017. (GZ 6405 kb)
Additional file 2Analysis report. A.html file that details all the analysis included herein, including additional figures referred to in the manuscript as well as some further analyses. (HTML 8315 kb)
Acknowledgements
Thanks to Thierry Voet and his research group for discussions and help with data access.
Funding
JAG was supported by Wellcome Trust Grant “Systematic Identification of Lineage Specification in Murine Gastrulation” (109081/Z/15/A). AS was supported by Wellcome Trust Grant “Tracing early mammalian lineage decisions by single cell genomics” (105031/B/14/Z). JCM was supported by core funding from Cancer Research UK (award no. A17197) and by core funding from EMBL. The funding bodies played no role in the design of the study or collection, analysis and interpretation of data or writing of the manuscript.
Availability of data and materials
All simulation and analysis code used in this study are available in Additional file 2. Data were downloaded using links provided by the original publications or acquired from the authors directly.
Abbreviations
aCGH | Array competitive genome hybridisation |
CPM | Counts per million |
DE | Differential expression |
G&T-seq | Genome and transcriptome sequencing |
MAD | Median absolute deviation |
PCA | Principal components analysis |
scRNA-seq | Single-cell RNA sequencing |
Authors’ contributions
JAG performed analysis. JAG, AS and JCM wrote the manuscript. AS and JCM supervised the study. All authors have read and approved the manuscript.
Notes
Ethics approval and consent to participate
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Footnotes
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-017-4253-x) contains supplementary material, which is available to authorized users.
Contributor Information
Jonathan A. Griffiths, Email: ku.ca.mac@612gaj.
Antonio Scialdone, Email: ed.nehcneum-ztlohmleh@enodlaics.oinotna.
John C. Marioni, Email: ku.ca.ibe@inoiram.
References
Open File Bin
Plastic File Bin
Editorial Board David Hutchison Lancaster University, UK Takeo Kanade Carnegie Mellon University, Pittsburgh, PA, USA Josef Kittler University of Surrey, Guildford, UK Jon M. Kleinberg Cornell University, Ithaca, NY, USA Friedemann Mattern ETH Zurich, Switzerland John C. Mitchell Stanford University, CA, USA Moni Naor Weizmann Institute of Science, Rehovot, Israel Oscar Nierstrasz University of Bern, Switzerland C. Pandu Rangan Indian Institute of Technology, Madras, India Bernhard Steffen University of Dortmund, Germany Madhu Sudan Massachusetts Institute of Technology, MA, USA Demetri Terzopoulos University of California, Los Angeles, CA, USA Doug Tygar University of California, Berkeley, CA, USA Moshe Y. Vardi Rice University, Houston, TX, USA Gerhard Weikum Max-Planck Institute of Computer Science, Saarbruecken, Germany
4867
Sehun Kim Moti Yung Hyung-Woo Lee (Eds.)
Information Security Applications 8th International Workshop, WISA 2007 Jeju Island, Korea, August 27-29, 2007 Revised Selected Papers
13
Volume Editors Sehun Kim KAIST, Department of Industrial Engineering 373-1, Guseong-dong, Yuseong-gu, Daejeon, 305-701, Korea E-mail: [email protected] Moti Yung Google Inc. Columbia University, Computer Science Department RSA Laboratories S.W.Mudd Building, New York, NY10027, USA E-mail: [email protected] Hyung-Woo Lee Hanshin University School of Computer Engineering 411, Yangsan-dong, Osan, Gyunggi, 447-791, Korea E-mail: [email protected]
Library of Congress Control Number: 2007942182 CR Subject Classification (1998): E.3, D.4.6, F.2.1, C.2, J.1, C.3, K.6.5 LNCS Sublibrary: SL 4 – Security and Cryptology ISSN ISBN-10 ISBN-13
0302-9743 3-540-77534-X Springer Berlin Heidelberg New York 978-3-540-77534-8 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. Springer is a part of Springer Science+Business Media springer.com © Springer-Verlag Berlin Heidelberg 2007 Printed in Germany Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India Printed on acid-free paper SPIN: 12210664 06/3180 543210
Preface
The 8th International Workshop on Information Security Applications (WISA 2007) was held on Jeju Island, Korea during August 27–29, 2007. The workshop was sponsored by the Korea Institute of Information Security and Cryptology (KIISC), the Electronics and Telecommunications Research Institute (ETRI) and the Ministry of Information and Communication (MIC). WISA aims at providing a forum for professionals from academia and industry to present their work and to exchange ideas. The workshop covers all technical aspects of security applications, including cryptographic and non-cryptographic techniques. We were very pleased and honored to serve as the Program Committee Co-chairs of WISA 2007. The Program Committee received 95 papers from 20 countries, and accepted 27 papers for the full presentation track. The papers were selected after an extensive and careful refereeing process in which each paper was reviewed by at least three members of the Program Committee. In addition to the contributed papers, the workshop had three special talks. Moti Yung gave a tutorial talk, entitled “Somebody You Know: The Fourth Factor of Authentication.” Kihong Park and Nasir Memon gave invited talks, entitled “Reactive Zero-Day Attack Protection” and “Securing Biometric Templates,” respectively. Many people deserve our gratitude for their generous contributions to the success of the workshop. We would like to thank all the people involved in the technical program and in organizing the workshop. We are very grateful to the Program Committee members and the external referees for their time and efforts in reviewing the submissions and selecting the accepted papers. We also express our special thanks to the Organizing Committee members for their hard work in organizing the workshop. Last but not least, on behalf of all those involved in organizing the workshop, we would like to thank all the authors who submitted papers to this workshop. Without their submissions and support, WISA could not have been a success.
December 2007
Sehun Kim Moti Yung Hyung-Woo Lee
Organization
Advisory Committee Man-Young Rhee Hideki Imai Mun Kee Choi Bart Preneel Kil-Hyun Nam Sang-Jae Moon Dong-Ho Won Pil-Joong Lee Dae-Ho Kim Joo-Seok Song
Kyung Hee University, Korea Tokyo University, Japan ETRI, Korea Katholieke Universiteit Leuven, Belgium Korea National Defense University, Korea Kyungpook National University, Korea Sungkyunkwan University, Korea POSTECH, Korea NSRI, Korea Yonsei University, Korea
General Co-chairs Min Surp Rhee Sung-Won Sohn
Dankook University, Korea ETRI, Korea
Steering Committee Kyo-Il Chung TaeKyoung Kwon Im-Yeong Lee Dong-Il Seo OkYeon Yi Jae-Kwang Lee
ETRI, Korea Sejong University, Korea Soonchunhyang University, Korea ETRI, Korea Kookmin University, Korea Hannam University, Korea
Organizing Committee Chair Finance Publication Publicity
Sang Choon Kim Taenam Cho Ji-Young Lim Gang Shin Lee Heuisu Ryu
Registration Treasurer Local Arrangements
Yoonjeong Kim Jaehoon Nah Khi Jung Ahn Dohoon Lee
Kangwon National University, Korea Woosuk University, Korea Korean Bible University, Korea KISA, Korea Gyeongin National University of Education, Korea Seoul Women’s University, Korea ETRI, Korea Cheju National University, Korea NSRI, Korea
VIII
Organization
Program Committee Co-chairs Sehun Kim Moti Yung Hyung-Woo Lee Members Gildas Avoine Lejla Batina Mike Burmester Ki-Joon Chae Myeonggil Choi Bruno Crispo Sven Dietrich Helena Handschu Heng Swee Huay Maria Isabel Gonzalez Vasco Kil-Hyun Jeong Gildas Avoine Soon-Won Jung Stefan Katzenbeisser Seungjoo Kim Seokwoo Kim Brian King Hong Seung Ko Dong Hoon Lee Pil Joong Lee Chae-Hun Lim Dongdai Lin Mose Liskov Michael Locasto Havier Lopez Masahiro Mambo Jung Chan Na Shozo Naito Yoram Ofek Heekuck Oh Susan Pancho-Festin In-Jae Park Duong Hieu Phan, Raphael C.-W. Phan Vassilis Prevelakis
KAIST, Korea Columbia University, USA Hanshin University, Korea
MIT, CSAIL, USA University of Leuven, Belgium Florida State University, USA Ewha University, Korea Inje University, Korea University of Trento, Italy CERT, CMU, USA Spansion, France Multimedia University, Malaysia Universidad Rey Juan Carlos, Spain Jangan College, Korea MIT, CSAIL, USA NITGEN, Korea Philips Research, Netherlands Sungkyunkwan University, Korea Hansei University, Korea Indiana University at Purdue, USA Kyoto College of Graduate Studies for Informatics, Japan CIST, Korea University, Korea POSTECH, Korea Sejong University, Korea SKLIS, Chinese Academy of Sciences, China William and Mary College, USA Columbia, USA University of Malaga, Spain Tsukuba, Japan ETRI, Korea Kyoto College of Graduate Studies for Informatics, Japan University of Trento, Italy Hanyang University, Korea University of the Philippines, Phillipines Dream Security, Korea University College London, UK EPFL, Switzerland Drexel University, USA
Organization
C. Pandu Rangan Kyung-Hyune Rhee Pankaj Rohatgi Ahmad-Reza Sadeghi Kouichi Sakurai Radu Sion Ki-Wook Sohn Francois-Xavier Standaert Yannis Stamatiou Koutarou Suzuki Huaxiong Wang Duncan Wong Heung-Youl Youm Rui Zhang Jianying Zhou
IX
IIT Madras, India Pukyong National University, Korea IBM Resaerch, USA Ruhr University, Bochum, Germany Kyushu University, Japan SUNY, Stony Brook, USA NSRI, Korea Louvaine University, Belgium University of Ioannina, Greece NTT Labs, Japan Nanyang Technological University, Singapore City University, Hong Kong Soonchunhyang University, Korea AIST, Japan Inst. for Infocomm Research, Singapore
Table of Contents
Public Key Crypto Applications Universal ηT Pairing Algorithm over Arbitrary Extension Degree . . . . . . . Masaaki Shirase, Yuto Kawahara, Tsuyoshi Takagi, and Eiji Okamoto Convertible Undeniable Proxy Signatures: Security Models and Efficient Construction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Wei Wu, Yi Mu, Willy Susilo, and Xinyi Huang Secret Signatures: How to Achieve Business Privacy Efficiently? . . . . . . . Byoungcheon Lee, Kim-Kwang Raymond Choo, Jeongmo Yang, and Seungjae Yoo
1
16
30
Biometrics/Information Hiding Implementation of BioAPI Conformance Test Suite Using BSP Testing Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jihyeon Jang, Stephen J. Elliott, and Hakil Kim
48
Information Hiding in Software with Mixed Boolean-Arithmetic Transforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yongxin Zhou, Alec Main, Yuan X. Gu, and Harold Johnson
61
Geometrically Invariant Image Watermarking in the DWT Domain . . . . . Shijun Xiang and Hyoung-Joong Kim
76
Secure Hardware Implementation of LSM-Based RBAC Module for Embedded System . . . Jae-Deok Lim, Sung-Kyong Un, Jeong-Nyeo Kim, and ChoelHoon Lee
91
Iteration Bound Analysis and Throughput Optimum Architecture of SHA-256 (384, 512) for Hardware Implementations . . . . . . . . . . . . . . . . . . . Yong Ki Lee, Herwin Chan, and Ingrid Verbauwhede
102
A Compact Architecture for Montgomery Elliptic Curve Scalar Multiplication Processor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yong Ki Lee and Ingrid Verbauwhede
115
XII
Table of Contents
Secure Systems Windows Vault: Prevention of Virus Infection and Secret Leakage with Secure OS and Virtual Machine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yoshiki Sameshima, Hideaki Saisho, Tsutomu Matsumoto, and Norihisa Komoda
128
An Architecture Providing Virtualization-Based Protection Mechanisms Against Insider Attacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Frederic Stumpf, Patrick R¨ oder, and Claudia Eckert
142
Detecting Motifs in System Call Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . William O. Wilson, Jan Feyereisl, and Uwe Aickelin
157
Wireless and Mobile Security Comparative Studies in Key Disagreement Correction Process on Wireless Key Agreement System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Toru Hashimoto, Takashi Itoh, Masazumi Ueba, Hisato Iwai, Hideichi Sasaoka, Kazukuni Kobara, and Hideki Imai Breaking 104 Bit WEP in Less Than 60 Seconds . . . . . . . . . . . . . . . . . . . . . Erik Tews, Ralf-Philipp Weinmann, and Andrei Pyshkin Efficient Implementation of the Pairing on Mobilephones Using BREW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Motoi Yoshitomi, Tsuyoshi Takagi, Shinsaku Kiyomoto, and Toshiaki Tanaka
173
188
203
Application Security/Secure Systems Security Analysis of MISTY1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hidema Tanaka, Yasuo Hatano, Nobuyuki Sugio, and Toshinobu Kaneko
215
A Generic Method for Secure SBox Implementation . . . . . . . . . . . . . . . . . . Emmanuel Prouff and Matthieu Rivain
227
On the Security of a Popular Web Submission and Review Software (WSaR) for Cryptology Conferences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Swee-Won Lo, Raphael C.-W. Phan, and Bok-Min Goi
245
Access Control/DB Security Authorization Constraints Specification of RBAC . . . . . . . . . . . . . . . . . . . . Lilong Han, Qingtan Liu, and Zongkai Yang
266
Table of Contents
XIII
Dynamic Access Control Research for Inter-operation in Multi-domain Environment Based on Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Zhuo Tang, Ruixuan Li, Zhengding Lu, and Zhumu Wen
277
A Compositional Multiple Policies Operating System Security Model . . . Lei Xia, Wei Huang, and Hao Huang
291
Smart Cards/Secure Systems Longer Randomly Blinded RSA Keys May Be Weaker Than Shorter Ones . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Colin D. Walter Differential Power Analysis of HMAC Based on SHA-2, and Countermeasures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Robert McEvoy, Michael Tunstall, Colin C. Murphy, and William P. Marnane Provably Secure Countermeasure Resistant to Several Types of Power Attack for ECC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . JaeCheol Ha, JeaHoon Park, SangJae Moon, and SungMing Yen
303
317
333
Anonymity and P2P Security Risk & Distortion Based K-Anonymity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Shenkun Xu and Xiaojun Ye
345
Optimizing Quality Levels and Development Costs for Developing an Integrated Information Security System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Myeonggil Choi and Sangmun Shin
359
ICRep: An Incentive Compatible Reputation Mechanism for P2P Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Junsheng Chang, Huaimin Wang, Gang Yin, and Yangbin Tang
371
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
387
Universal ηT Pairing Algorithm over Arbitrary Extension Degree Masaaki Shirase1 , Yuto Kawahara1, Tsuyoshi Takagi1 , and Eiji Okamoto2 1
Future University-Hakodate, Japan 2 University of Tsukuba, Japan
Abstract. The ηT pairing on supersingular is one of the most efficient algorithms for computing the bilinear pairing [3]. The ηT pairing defined over finite field F3n has embedding degree 6, so that it is particularly efficient for higher security with large extension degree n. Note that the explicit algorithm over F3n in [3] is designed just for n ≡ 1 (mod 12), and it is relatively complicated to construct an explicit algorithm for n 0003≡ 1 (mod 12). It is better that we can select many n’s to implement the ηT pairing, since n corresponds to security level of the ηT pairing. In this paper we construct an explicit algorithm for computing the ηT pairing with arbitrary extension degree n. However, the algorithm should contain many branch conditions depending on n and the curve parameters, that is undesirable for implementers of the ηT pairing. This ηT pairing), which satispaper then proposes the universal ηT pairing (0002 fies the bilinearity of pairing (compatible with Tate pairing) without any branches in the program, and is as efficient as the original one. Therefore the proposed universal ηT pairing is suitable for the implementation of various extension degrees n with higher security. Keywords: Tate pairing, ηT pairing, Duursma-Lee algorithm, efficient implementation.
1
Introduction
Recently, bilinear pairings defined on elliptic curves such as Tate pairing and the ηT pairing have been attracted to make new cryptographic protocols, for example, identity-based cryptosystem [5], short signature [7] and efficient broadcast cryptosystem [6], come true. A standard algorithm for computing the Tate pairing is Miller algorithm [12]. The computational cost of Miller algorithm is generally larger than that of RSA or elliptic curve cryptosystems [2]. It is one of important research fields in cryptography to improve the computational cost of pairings. Supersingular curves with characteristic three has embedding degree 6, so that it is particularly efficient for higher security. Some efficient variations of Miller algorithm in base three have been proposed for computing Tate pairing on supersingular elliptic curves over characteristic three [2,10]. Duursma and Lee proposed a closed form generated by divisor gR = 3(R) + (−3R) − 4(O) for a point R, which S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 1–15, 2007. c Springer-Verlag Berlin Heidelberg 2007 0002
2
M. Shirase et al.
can efficiently compute Tate pairing [8]. Barreto et. al. then proposed the ηT pairing which can reduce the iteration number of the main loop of DuursmaLee algorithm [3]. The computational cost of the ηT pairing is about half of the Duursma-Lee algorithm. The ηT pairing is currently one of the fastest algorithm for computing the bilinear pairing. It is easy to convert between Tate pairing and the ηT pairing (see [3] or [4] for details). This paper focuses on the ηT pairing defined over finite field F3n . Extension degree n of F3n has to satisfy the following conditions due to several attacks: n is an odd prime number, l is a large prime number with l (36n − 1), where l is the order of the subgroup of the elliptic curve used in pairing. The extension degrees that satisfy these conditions are n = 97, 163, 167, 193, 239, 313, 353, .. On the other hand the explicit algorithm for computing the ηT pairing in [3] deals only with n ≡ 1 (mod 12). Therefore, the previous researches on the ηT pairing have been implemented in the case of n ≡ 1 (mod 12) [3,4,14]. To our knowledge there is no literature that proposes the ηT pairing over F3n for general extension degree n.1 Note that we should modify it if we try to construct an explicit algorithm for n 0003≡ 1 (mod 12), namely n = 163, 167, 239, 353, .. It is relatively complicated to construct an explicit algorithm for n 0003≡ 1 (mod 12). In this paper we present an explicit algorithm for arbitrary prime number n with gcd(n, 6) = 1. The proposed explicit algorithm depends on the extension degree n and the coefficients of the underlying curves, which is not suitable for implementers of the ηT pairing. Therefore this paper proposes the universal ηT pairing whose algorithm does not depend on n and whose computational cost is same as the original ηT pairing. Moreover we present the explicit relationship between Tate pairing and the universal ηT pairing, which make the universal ηT pairing compatible with Tate pairing for arbitrary extension degree n. The remainder of this paper is organized as follows: In Section 2 we explain about the known properties of the ηT pairing. In Section 3 we describe the proposed algorithms including an explicit algorithm for computing the ηT pairing over arbitrary degree n and the universal ηT pairing. Proposition 1 shows the relationship between Tate pairing and the universal ηT pairing. We then present some timings of the universal ηT pairing in C language. In Section 4 we present the proof of Proposition 1 and the correctness of algorithms appeared in Section 3. In Section 5 we conclude this paper.
2
Tate Pairing Over Supersingular Curve with Characteristic Three
Let F3n be an extension field over F3 of degree n. Let E b be the supersingular elliptic curve defined by y 2 = x3 − x + b with b ∈ {1, −1}. All supersingular curves are isomorphic to this curve. The set of all points on E b over F3n defined by E b (F3n ) = {(x, y) ∈ F3n × F3n : y 2 = x3 − x + b} ∪ {O}, 1
In the case of the ηT pairing over F2n , MIRACL supports the general extension degree using 4 branches [13].
Universal ηT Pairing Algorithm over Arbitrary Extension Degree
3
forms a group, where O is the point at infinity. Note that the extension degree n should be gcd(n, 6) = 1, it then satisfies n ≡ 1, 5, 7, 11 (mod 12) [3]. In this paper we deal with the arbitrary degree n with gcd(n, 6) = 1. We define b0003 as 0003 b if n ≡ 1, 11 (mod 12), b0003 = (1) −b if n ≡ 5, 7 (mod 12), then it is known that #E b (F3n ) = 3n + 1 + b0003 3(n+1)/2 . 2.1
(2)
Tate Pairing
Let l be a large prime number, l #E b (F3n ) and l (36n − 1). Let P ∈ E b (F3n )[l] and let Q ∈ E b (F36n )/lE b (F36n ). Then Tate pairing e(P, Q) over E b (F3n ) is a pairing, e : E b (F3n )[l] × E b (F36n )/lE b (F36n ) → F3∗6n /(F3∗6n )l , and defined as e(P, Q) = fP,l (Q), where fP,N is a function whose divisor is (fP,N ) = (N − 1)(P ) − ((N − 1)P ) − (N − 2)(O) for any positive integer N . Since e(P, Q) ∈ F3∗6n /(F3∗6n )l , we require an arithmetic on F36n . A basis {1, σ, ρ, σρ, ρ2 , σρ2 } of F36n over F3n gives an efficient arithmetic on F36n , where σ and ρ satisfy σ = −1 and ρ3 = ρ + b. For a point Q = (x, y) ∈ E b (F3n ) the distortion map ψ is one-to-one homomorphism defined by ψ(x, y) = (ρ − x, yσ) in E b (F36n ).
(3)
Then e(P, ψ(Q)) is defined for P, Q ∈ E b (F3n ). Note that the representation of e(P, ψ(Q)) has ambiguity since e(P, ψ(Q)) is contained in a coset of the residue group F3∗6n /(F3∗6n )l . In order to remove this ambiguity, the final exponentiation is 6n required, which is a powering by (36n − 1)/l. Here we denote e(P, ψ(Q))(3 −1)/l by eˆ(P, Q), then eˆ(P, Q) has bilinearity, namely eˆ(aP, Q) = eˆ(P, aQ) = eˆ(P, Q)a for any non zero integer a. The bilinearity is used in many new cryptographic applications such as identity-based cryptosystem [5], short signature [7] and efficient broadcast cryptosystem [6]. Miller proposed an efficient algorithm for computing fP,l (ψ(Q)) on arbitrary elliptic curve over arbitrary field [12]. Barreto et. al. [2] and Galbraith et. al. [10] proposed Miller algorithm in base three using the following calculation of function f at point Q ∈ E b (F3n ), f ← f 3 · (l1 l2 )(Q), where l1 , l2 are a tangent line of E b at Q and a line going through Q and 2Q, respectively. Miller algorithm in base three is suitable for pairing on E b (F3n ) since cubing operation and a computation of 3Q are virtually for free. Note that 3Q for Q = (xq , yq ) ∈ E b (F3n ) is calculated as follows: 3Q = (x9q − b, −yq9 ) = φπ 2 (Q),
(4)
where π is the 3rd-power Frobenius map on E b , namely π(Q) = (x3q , yq3 ), and φ is a map defined as (5) φ(xq , yq ) = (xq − b, −yq ).
4
2.2
M. Shirase et al.
Duursma-Lee Algorithm
There is an important property of Tate pairing [10]. Let m be an integer such 6n 6n that l m and m (36n − 1). Then fP,m (ψ(Q))(3 −1)/m = fP,l (ψ(Q))(3 −1)/l = eˆ(P, Q). Duursma and Lee effectively used this property to propose a closed algorithm for computing Tate pairing on supersingular curves [8]. We know that l (33n +1) and (33n + 1) (36n − 1) due to Eq.(2) and l #E(F3n ). In the algorithm 33n + 1 is set to N , where the Hamming weight of N in base three is very sparse. Duursma and Lee then showed that the function l1 l2 of Miller algorithm in base three is equivalent to an explicit function gR (x, y) = yr3 y − (x3r + x − b)2 ,
(6)
whose divisor is (gR ) = 3R + (−3R) − 4(O) for R = (xr , yr ). The function gR can be utilized to compute a function fP,3k +1 for any positive integer k [11], k−1 3k−2 · · · g33k−2 P g3k−1 P . (7) fP,3k +1 = gP3 g3P 00043n 3n−i . Setting k = 3n in Eq.(7), we see fP,33n +1 (ψ(Q)) = i=1 (g3i−1 P (ψ(Q)))3 Therefore we obtain n 0005 fP,33n +1 (ψ(Q)) = gπi (P ) (π n+1−i (ψ(−Q)) (8) i=1 3n−i
2n−i
n−i
= (g3i−n−1 P (ψ(Q)))3 = (g3i−2n−1 P (ψ(Q)))3 . due to (g3i−1 P (ψ(Q)))3 The explicit description of Duursma-Lee algorithm is derived from Eq. (8) and thus it has n iterations in the main loop. 2.3
ηT Pairing
Barreto et. al. [3] proposed the ηT pairing to decrease the iteration number of Duursma-Lee algorithm. Here we describe the ηT pairing on supersingular curve over characteristic three. Let T be an integer such that T = 3(n+1)/2 + b0003 .
(9)
b
Then the ηT (P, Q) for P, Q ∈ E (F3n ) is defined as ηT (P, Q) = f−P,T (ψ(Q)) if b0003 = 1 and ηT (P, Q) = fP,T (ψ(Q)) otherwise. 0004 (n+1)/2−i Setting k = (n + 1)/2 in Eq.(7), we see fP,3(n+1)/2 +1 = (n+1)/2 g33i−1 . i=1 Barreto et. al. showed that the difference between f±P,T and fP,3(n+1)/2 +1 is represented by a function of a line l3P 0002 ,b0002 P going through 3P 0003 and b0003 P , where 0004(n−1)/2 (n−1)/2−i P 0003 = 3(n−1)/2 P . Then ηT (P, Q) = l3P 0002 ,b0002 P (ψ(Q)) j=0 g3i P (ψ(Q))3 . Moreover it can be rewritten as 0005
(n−1)/2
ηT (P, Q) = l
3P 0002 ,b0002 P
(ψ(Q))
j=0
to remove the exponent 3(n−1)/2 .
j
g3−j P 0002 (ψ(Q))3 ,
(10)
Universal ηT Pairing Algorithm over Arbitrary Extension Degree
5
Eq.(10) is similar to Eq.(8), but only has (n + 1)/2 iterations, which means the cost of ηT pairing is about half of Duursma-Lee algorithm. Note that T = 3(n+1)/2 ± 1 is as large as #E b (F3n ) − 3n − 1 , which is the absolute value of the trace of E b (F3n ). b ηT (P, Q) itself is contained in a coset of the residue group F3∗6n /(F3∗6n )#E (F3n ) . Therefore one cannot use ηT (P, Q) in cryptographic protocols due to its ambiguity. ηT (P, Q) requires the final exponentiation of powering by W to be a bilinear pairing, where W is an integer defined as W = (33n − 1)(3n + 1)(3n + 1 − b0003 (3(n+1) )) (= (36n − 1)/#E b (F3n ) ).
(11)
There is an efficient algorithm for computing the final exponentiation in [15]. Let Z be an integer such that Z = −b0003 3(n+3)/2 .
(12)
Then there is a relationship between the ηT pairing and Tate pairing, 2
(ηT (P, Q)W )3T = eˆ(P, Q)Z .
(13)
It is essential to find an algorithm for computing eˆ(P, Q)X for some integer X that becomes a bilinear pairing. However, if we need to convert the ηT pairing to Tate pairing via Eq.(13), there is an efficient conversion algorithm, see [4]. Note that the original algorithm for computing the ηT pairing in [3] includes computations of cube root computations. In general it takes the cost of 0.8∼2 multiplications [1], then we cannot neglect their costs. Beuchat et. al. generated an algorithm (Algorithm 2 in [4]) that has no cube root and outputs (n+1)/2 . ηT (P, Q)3
3
Proposed Explicit Algorithms
In this section we present an explicit algorithm for computing the ηT pairing with arbitrary extension degree n. We then propose the universal ηT pairing whose algorithm has no branch in the program. 3.1
ηT Pairing for Arbitrary n
An algorithm for computing the ηT pairing with arbitrary extension degree n can be constructed from Eq.(10). Since both l3P 0002 ,b0002 P and g3−j P 0002 depend on the extension degree n and the curve parameter b0003 , the explicit description of ηT pairing has a complex form and causes many branches in the program. Lemma 5 of [3] explains about l3P 0002 ,b0002 P in all cases, however g3−j P 0002 is considered only for n ≡ 1 (mod 12). In this section we investigate g3−j P 0002 in details. Note that g3−j P 0002 needs a computation of P 0003 = 3(n−1)/2 P . In order to efficiently compute P 0003 we use Eq.(4), then we see P 0003 = φ(n−1)/2 π (n−1) (P ).
(14)
6
M. Shirase et al. (n+1)/2
Algorithm 1. Computation of ηT (P, Q)3 for arbitrary n input: P = (xp , yp ), Q = (xq , yq ) ∈ E b (F3n ) b 3(n+1)/2 output: (η ∈ F3∗6n /(F3∗6n )#E (F3n ) 0003T (P, Q)) b if n ≡ 1, 11 (mod 12) 1. b0003 ← −b if n ≡ 5, 7 (mod 12) 2. if b0003 =⎧ 1 then yp ← −yp −yp (xp + xq + b) + yq σ + yp ρ if n ≡ 1 (mod 12) ⎪ ⎪ ⎨ −yp (xp + xq − b) + yq σ + yp ρ if n ≡ 5 (mod 12) 3. R0 ← ⎪ yp (xp + xq + b) + yq σ − yp ρ if n ≡ 7 (mod 12) ⎪ ⎩ 0003 yp (xp + xq − b) + yq σ − yp ρ if n ≡ 11 (mod 12) b if n ≡ 1, 7 (mod 12) 4. d ← −b if n ≡ 5, 11 (mod 12) 5. for i ← 0 to (n − 1)/2 do 6. r0 ← x0003p + xq + d −r02 + yp yq σ − r0 ρ − ρ2 if n ≡ 1, 5 (mod 12) 7. R1 ← −r02 − yp yq σ − r0 ρ − ρ2 if n ≡ 7, 11 (mod 12) 8. R0 ← R0 R1 9. yq ← −yp 10. xq ← x9q , yq ← yq9 11. R0 ← R03 12. d ← d − b (mod 3) 13. end for 14. return R0
The explicit description of P 0003 depends on not only the extension degree n but also the curve parameter b arisen from φ in Eq. (5). We present Algorithm 1 which is an explicit algorithm for computing the ηT pairing with arbitrary n. The proposed explicit algorithm is based on the variation of the ηT pairing discussed by Beuchat et. al. [4] which has no cube root computation for n ≡ 1 (mod 12). Refer Section 4.1 for a proof of the correctness of Algorithm 1. The branches in Steps 1-4 and Step 7 are caused by l3P 0002 ,b0002 P (Lemma 5 of [3]) and g3−j P 0002 , respectively. 3.2
Universal ηT Pairing
Algorithm 1 has many branches that depend on the value of (n mod 12) and b0003 . If there is an algorithm without branches, then it becomes more implementorfriendly. Therefore Section 3.2 proposes the universal ηT pairing, η0002 T (P, Q), that has no branch and is as efficient as the original ηT pairing. The proposed algorithm is given by Algorithm 2. The following proposition describes the difference between the ηT pairing (Algorithm 1) and the η0002 T pairing (Algorithm 2).
Universal ηT Pairing Algorithm over Arbitrary Extension Degree
7
Algorithm 2. Computation of η0002 T (P, Q) for arbitrary n input: P = (xp , yp ), Q = (xq , yq ) ∈ E b (F3n ) ∗ ∗ #E b (F3n ) output: η0002 T (P, Q) ∈ F36n /(F36n ) 1. R0 ← −yp (xp + xq + b) + yq σ + yp ρ 2. d ← b 3. for i ← 0 to (n − 1)/2 do 4. r0 ← x p + x q + d 5. R1 ← −r02 + yp yq σ − r0 ρ − ρ2 6. R0 ← R0 R1 7. yq ← −yp 8. xq ← x9q , yq ← yq9 9. R0 ← R03 10. d ← d − b (mod 3) 11. end for 12. return R0
Proposition 1. Let n be an odd prime with gcd(n, 6) = 1, and let T , W and Z be integers defined as Eqs.(9), (11) and (12), respectively. Then we have the b following properties of η0002 T (P, Q) for P, Q ∈ E (F3n ). W with final exponentiation W is a non-degenerate and bilinear pairing. (i) η0002 T (P, Q) W (ii) η0002 = eˆ(P, Q)U , where U = (3(n−1)/2 · V ZT −2 mod #E b (F3n )) and T (P, Q) V is defined by the following table. b=1 b = −1 n ≡ 1 (mod 12) −1 1 n ≡ 5 (mod 12) 3(n+1)/2 − 2 3(n+1)/2 + 2 n ≡ 7 (mod 12) −1 1 n ≡ 11 (mod 12) −3(n+1)/2 − 2 −3(n+1)/2 + 2
The proof of Proposition 1 is described in Section 4.2. The final exponentiation for η0002 T (P, Q) is same as that for the ηT (P, Q) pairing, which is efficiently computed by the algorithm from [15]. Due to Proposition 1-(i) we can apply the η0002 T pairing to cryptographic applications with a bilinear pairing. If necessary, we can obtain Tate pairing eˆ(P, Q) from η0002 T (P, Q) due to Proposition 1-(ii). W Note that η0002 (P, Q) is included in the torus T2 (F33n ). Therefore the converT W (P, Q) to e ˆ (P, Q), a powering by U −1 , can be efficiently performed sion of η0002 T with arithmetic in T2 (F33n ), refer to [15]. Moreover, the proposed η0002 T pairing has good properties, namely it has no branch and no cube root computation unlike the original ηT pairing. The η0002 T pairing is as efficient as the variation of ηT pairing, which is one of the fastest implementations of a bilinear pairing [4]. 3.3
Implementation Results
We implemented the η0002 T pairing (Algorithm 2) in C language. It is implemented on an AMD OpteronTM Processor 275 at 2.2GHz using 8GByte RAM.
8
M. Shirase et al. Table 1. Timing of operations on F3n and computation of the η0002 T pairing (μsec) Extension degree (n) Addition Cubing Multiplication Inversion W η0002 (Alg.2+[15]) T
97(SSE) 0.0083 0.0394 0.5009 7.7111 479.63
97 0.0168 0.1610 1.2056 12.0865 1164.16
167 0.0210 0.2104 2.9757 28.6980 4406.26
193 0.0237 0.2694 3.7164 39.7646 6267.99
239 0.0265 0.3052 5.3137 55.5295 10753.17
313 0.0377 0.3943 8.2219 95.9911 21796.96
We mainly follow the implementation described in [9]. The polynomial base representation is used for F3n . Finite field F3 = {0, 1, 2} is encoded by two bits, and an addition in F3 is programmed by 7 Boolean logic operations [9]. We implemented the multiplication by the right-to-left sfift-addition algorithm with the signed window method of width 3. The extended Euclidean algorithm is used for the inversion. We deploy the final exponentiation using the torus proposed by Shirase et. al. [15]. Table 1 presents the timing of the η0002 T pairing for different extension degrees n = 97, 167, 193, 239, 313. The timing is an average value for 1,000,000 randomly chosen elements on the base field F3n or elliptic curve E b (F3n ). If we choose about twice larger extension degree, then the η0002 T pairing becomes about 5 times slower. The η0002 T pairing with n = 313 can be implemented in about 20 milliseconds. In case of n = 97 we optimized our programming suitable for the streaming SIMD extensions (SSE). The timing using SSE for the η0002 T pairing with n = 97 achieves under 0.5 milliseconds, which is more than twice as fast than the implementation without SSE. The embedded field of extension degree n is F36n , and their bit size are 923, 1589, 1836, 2273, 2977 for n = 97, 167, 193, 239, 313, respectively.
4
Proofs of Proposition and Algorithm
We prove the Proposition 1 and the correctness of Algorithm 1 described in this paper. 4.1
Proof of Algorithm 1
In order to prove the correctness of Algorithm 1 we introduce Algorithm 3 which is an extension of the original ηT (P, Q) [3] to arbitrary extension de(Alg.1) (Alg.3) gree n. Denote by R0,j and R0,j the value in register R0 at the j-th loop of Algorithm 1 and Algorithm 3, respectively. They are related by equation (Alg.1) (Alg.3) 3j+1 R0,j = (R0,j ) (see also Appendix II in [4]). Therefore we see that if (n+1)/2
Algorithm 3 outputs ηT (P, Q) then Algorithm 1 outputs ηT (P, Q)3 . Then it is sufficient to prove the correctness of Algorithm 3. Recall that ηT (P, Q) for arbitrary n is defined using two values, l3P 0002 ,b0002 P (ψ(Q)) j j and g3−j P 0002 (φ(Q))3 . We prove that g3−j P 0002 (φ(Q))3 , which corresponds to the jth loop of Step 5 and 6 in Algorithm 3, can be computed by Lemma 1.
Universal ηT Pairing Algorithm over Arbitrary Extension Degree
9
Algorithm 3. Computation of ηT (P, Q) for arbitrary n (Including cube root version) input: P = (xp , yp ), Q = (xq , yq ) ∈ E b (F3n ) b output: η0003T (P, Q) ∈ F3∗6n /(F3∗6n )#E (F3n ) b if n ≡ 1, 11 (mod 12) 1. b0003 ← −b if n ≡ 5, 7 (mod 12) 1 then yp ← −yp 2. if b0003 =⎧ ⎪ −yp (xp + xq + b) + yq σ + yp ρ if n ≡ 1 (mod 12) ⎪ ⎨ −yp (xp + xq − b) + yq σ + yp ρ if n ≡ 5 (mod 12) 3. R0 ← yp (xp + xq + b) + yq σ − yp ρ if n ≡ 7 (mod 12) ⎪ ⎪ ⎩ yp (xp + xq − b) + yq σ − yp ρ if n ≡ 11 (mod 12) 4. for i ← 00003to (n − 1)/2 do xp + xq + b if n ≡ 1, 7 (mod 12) 5. r0 ← 0003xp +2 xq − b if n ≡ 5, 11 2(mod 12) −r0 + yp yq σ − r0 ρ − ρ if n ≡ 1, 5 (mod 12) 6. R1 ← −r02 − yp yq σ − r0 ρ − ρ2 if n ≡ 7, 11 (mod 12) 7. R0 ← R0 R1 1/3 1/3 8. x p ← x p , yp ← yp 3 3 9. x q ← x q , yq ← yq 10. end for 11. return R0
Lemma 1. Let n be an odd prime. Then
(−j) (j) −r02 + yp yq σ − r0 ρ − ρ2 if n ≡ 1 (mod 4), 3j g3−j P 0002 (φ(Q)) = (−j) (j) 2 −r0 − yp yq σ − r0 ρ − ρ2 if n ≡ 3 (mod 4), for P = (xp , yp ), Q = (xq , yq ) ∈ E b (F3n ), where r0 is defined as 0003 xp + xq + b if n ≡ 1 (mod 6), r0 = xp + xq − b if n ≡ 5 (mod 6). 0002
Proof. See the appendix. Next we have l
3P 0002 ,b0002 P
0003 (x, y) =
y + yp (x − xp ) − b0003 yp if n ≡ 1, 5 (mod 12), y − yp (x − xp ) − b0003 yp if n ≡ 7, 11 (mod 12).
(15)
from Lemma 5 of [3]. Therefore the formula for R0 in Steps 1 and 3 can be obtained due to Eqs. (1) and (15). Therefore we prove the correctness of Algorithm 3. 4.2
Proof of Proposition 1
We first prove Proposition 1-(ii). Let ηT (P, Q) be the output of Algorithm 4. (Alg.2) (Alg.4) Denote by R0,j and R0,j the value in register R0 at the j-th loop of Algorithm 2 and Algorithm 4, respectively. Then we see that (n+1)/2
3 η0002 T (P, Q) = ηT (P, Q)
,
(16)
10
M. Shirase et al. Algorithm 4. Computation of ηT (P, Q) for arbitrary n input: P = (xp , yp ), Q = (xq , yq ) ∈ E b (F3n ) b output: ηT (P, Q) ∈ F3∗6n /(F3∗6n )#E (F3n ) 1. R0 ← −yp (xp + xq + b) + yq σ + yp ρ 2. for i ← 0 to (n − 1)/2 do 3. r0 ← x p + x q + b 4. R1 ← −r02 + yp yq σ − r0 ρ − ρ2 5. R0 ← R0 R1 1/3 1/3 6. x p ← x p , yp ← yp 3 3 7. x q ← x q , yq ← yq 8. end for 9. return R0 (Alg.2)
(Alg.4)
j+1
since R0,j = (R0,j )3 . Due to Eqs.(13) and (16), it is enough to prove W that ηT (P, Q) = ηT (P, Q)V W . The difference between Algorithm 3 and 4 causes the corresponding difference between ηT (P, Q) and ηT (P, Q). There are two differences, the first difference is that Algorithm 3 has the program “ if b0003 = 1 then yp ← −yp ”,
(17)
and the second difference is that Algorithm 3 has the branches at Steps 1, 3, 5 and 6. In order to investigate the first difference, we modify Algorithm 4 by appending the program (17) before Step 1. We call this modified algorithm as Algorithm 4’, and denote by ηT0003 (P, Q) the pairing value from Algorithm 4’. The relationship between ηT (P, Q) and ηT0003 (P, Q) is obtained by Lemma 2. Lemma 2. We have W
ηT (P, Q)
0003 =
(ηT0003 (P, Q)W )−1 if b0003 = 1 ηT0003 (P, Q)W if b0003 = −1
Proof. We see that ηT is identical to ηT0003 if b = −1. On the other hand, ηT is different from ηT0003 if b = 1. We obtain ηT (P, Q) = ηT0003 (−P, Q) since −P = (xp , −yp ) for P = (xp , yp ) ∈ E b (F3n ). Bilinearity of ηT (P, Q)W products a relationship ηT (P, Q)W = ηT0003 (−P, Q)W = (ηT0003 (P, Q)W )−1 . 0002 Remark 1. ηT without the powering by W is not bilinear pairing. Then the powering by W is required in the statement of Lemma 2. The second difference causes the difference between ηT0003 (P, Q) and ηT (P, Q). We soon see that ηT0003 (P, Q) = ηT (P, Q) if n ≡ 1 (mod 12). When n ≡ 5 (mod 12), a converting xq → xq − b, in other words Q → φ4 (Q), in Algorithm 2 gives Algorithm 4. Then ηT0003 (P, Q) = ηT (P, φ4 (Q)) if n ≡ 5 (mod 12). We easily see also that the relationship between ηT0003 (P, Q) and ηT (P, Q) for n ≡ 7, 11 (mod 12). Then we have
Universal ηT Pairing Algorithm over Arbitrary Extension Degree
ηT0003 (P, Q)
11
⎧ η (P, Q) ⎪ ⎨ T 4
if n ≡ 1 (mod 12) ηT (P, φ (Q))(= ηT (P, −φ(Q))) if n ≡ 5 (mod 12) = if n ≡ 7 (mod 12) ⎪ ⎩ ηT (P, −Q) if n ≡ 11 (mod 12) ηT (P, φ(Q))
Due to Lemma 2 and Eq.(18) we ηT (P, Q)W , ⎧ W −1 if ⎪ ⎪ (ηT (P, Q)W ) ⎪ ⎪ η (P, Q) if T ⎪ ⎪ W −1 ⎪ ⎪ (P, −φ(Q)) ) if (η T ⎪ ⎨ ηT (P, −φ(Q))W if W ηT (P, Q) = W −1 (η if T (P, −Q) ) ⎪ ⎪ ⎪ ⎪ ηT (P, −Q)W if ⎪ ⎪ ⎪ ⎪ if (ηT (P, φ(Q))W )−1 ⎪ ⎩ W ηT (P, φ(Q))
(18)
have the relationship ηT (P, Q)W and
n ≡ 1 (mod 12), n ≡ 1 (mod 12), n ≡ 5 (mod 12), n ≡ 5 (mod 12), n ≡ 7 (mod 12), n ≡ 7 (mod 12), n ≡ 11 (mod 12), if n ≡ 11 (mod 12),
b0003 b0003 b0003 b0003 b0003 b0003 b0003 b0003
=1 = −1 =1 = −1 =1 = −1 =1 = −1
(b = 1) (b = −1) (b = −1) (b = 1) (b = −1) (b = 1) (b = 1) (b = −1)
(19)
Lastly in order to show that φ is a homomorphism of E b (F3n ), we show that φ is represented as a scalar multiplication. Lemma 3. For P ∈ E b (F3n ), φ(P ) is equal to a value in the following table. b=1 b = −1 n ≡ 1 (mod 12) 3n P 3n P (n+1)/2 (n+1)/2 n ≡ 5 (mod 12) (−3 + 2)P (3 + 2)P n ≡ 7 (mod 12) 3n P 3n P n ≡ 11 (mod 12) (3(n+1)/2 + 2)P (−3(n+1)/2 + 2)P
Proof. See the appendix.
0002
Here we go back to the proof of Proposition 1. Lemma 3, Eq.(19), and the bilinearity of ηT (P, Q)W yield Proposition 1-(ii). Finally we prove Proposition 1-(i) in the following. V and 3(n+1)/2 are coprime to #E b (F3n ) with V = ±1, 3(n+1) ± 2, −3(n+1) ± 2, which means that a powering by 3(n+1)/2 · V is a group isomorphism in F3∗6n . The ηTW is a non-degenerate and bilinear pairing, (n+1)/2 VW then the ηT W (= ηT3 ) is also a non-degenerate and bilinear pairing.
5
Conclusion
This paper provided an explicit algorithm for computing the ηT pairing with arbitrary degree n. It has many branches based on extension degree n and the curve parameter b. Therefore, this paper also proposed the universal ηT pairing (0002 ηT pairing) which has no branch in the program and is suitable for the efficient implementation for arbitrary extension degree n. Moreover we proved the relationship between the η0002 T pairing and the Tate pairing for arbitrary n. Finally we summarize the relationship of pairings appeared in this paper in the following table.
12
M. Shirase et al. Pairing Properties eˆ(P, Q) no branch and no cube root ([11]) ⏐ Eq.(13)
W ηT (P, branches and cube roots (Sec.3.1) Q) ⏐
powering by V Proposition 1) W ηT (P, no branch and cube roots (Sec.4.2) Q) ⏐
powering by 3(n+1)/2 W η0002 no branch and no cube root (Sec.3.2) T (P, Q)
References 1. Barreto, P.: A note on efficient computation of cube roots in characteristic 3, Cryptology ePrint Archive, Report 2004/305 (2004) 2. Barreto, P., Kim, H., Lynn, B., Scott, M.: Efficient algorithms for pairing-based cryptosystems. In: Yung, M. (ed.) CRYPTO 2002. LNCS, vol. 2442, pp. 354–368. Springer, Heidelberg (2002) ´ hEigeartaigh, ´ 3. Barreto, P., Galbraith, S., O C., Scott, M.: Efficient pairing computation on supersingular abelian varieties. In: Designs, Codes and Cryptography, vol. 42(3), pp. 239–271. Springer, Heidelberg (2007) 4. Beuchat, J.-L., Shirase, M., Takagi, T., Okamoto, E.: An algorithm for the ηT pairing calculation in characteristic three and its hardware implementation. In: 18th IEEE International Symposium on Computer Arithmetic, ARITH-18, pp. 97–104 (2007) full version, Cryptology ePrint Archive, Report 2006/327 (2006) 5. Boneh, D., Franklin, M.: Identity based encryption from the Weil pairing. SIAM Journal of Computing 32(3), 586–615 (2003) 6. Boneh, D., Gentry, C., Waters, B.: Collusion resistant broadcast encryption with short ciphertexts and private keys. In: Shoup, V. (ed.) CRYPTO 2005. LNCS, vol. 3621, pp. 258–275. Springer, Heidelberg (2005) 7. Boneh, D., Lynn, B., Shacham, H.: Short signature from the Weil pairing. Journal of Cryptology 17(4), 297–319 (2004) 8. Duursma, I., Lee, H.: Tate pairing implementation for hyperelliptic curves y 2 = xp − x + d. In: Laih, C.-S. (ed.) ASIACRYPT 2003. LNCS, vol. 2894, pp. 111–123. Springer, Heidelberg (2003) 9. Granger, R., Page, D., Stam, M.: Hardware and software normal basis arithmetic for pairing-based cryptography in characteristic three. IEEE Transactions on Computers 54(7), 852–860 (2005) 10. Galbraith, S., Harrison, K., Soldera, D.: Implementing the Tate pairing. In: Fieker, C., Kohel, D.R. (eds.) Algorithmic Number Theory. LNCS, vol. 2369, pp. 324–337. Springer, Heidelberg (2002) 11. Kwon, S.: Efficient Tate pairing computation for supersingular elliptic curves over binary fields, Cryptology ePrint Archive, Report 2004/303 (2004) 12. Miller, V.: Short programs for functions on curves, Unpublished manuscript (1986), http://crypto.stanford.edu/miller/miller.pdf 13. MIRACL, ftp://ftp.computing.dcu.ie/pub/crypto/miracl.zip ´ ´ Murphy, C., Kerins, T., Barreto, P.: A reconfig14. Ronan, R., hEigeartaigh, C.O., urable processor for the cryptographic ηT pairing in characteristic 3. In: Information Technology: New Generations, ITNG 2007, pp. 11–16. IEEE Computer Society, Los Alamitos (2007)
Universal ηT Pairing Algorithm over Arbitrary Extension Degree
13
15. Shirase, M., Takagi, T., Okamoto, E.: Some efficient algorithms for the final exponentiation of ηT pairing. In: ISPEC 2007. LNCS, vol. 4464, pp. 254–268. Springer, Heidelberg (2007) 16. Silverman, J.: The arithmetic of elliptic curves. Springer, Heidelberg (1986)
A
Some Lemmas
In the appendix we prove two lemmas appeared in this paper. Lemma 1. Let n be an odd prime. Then
(−j) (j) −r02 + yp yq σ − r0 ρ − ρ2 if n ≡ 1 (mod 4), 3j g3−j P 0002 (φ(Q)) = (−j) (j) 2 −r0 − yp yq σ − r0 ρ − ρ2 if n ≡ 3 (mod 4), for P = (xp , yp ), Q = (xq , yq ) ∈ E b (F3n ), where r0 is defined as 0003 xp + xq + b if n ≡ 1 (mod 6), r0 = xp + xq − b if n ≡ 5 (mod 6). Proof. First we inspect how P 0003 = 3(n−1)/2 P that ⎧ (x, y) if ⎪ ⎪ ⎪ (x − b, −y) if ⎪ ⎪ ⎨ (x + b, y) if φs (x, y) = (x, −y) if ⎪ ⎪ ⎪ ⎪ (x − b, y) if ⎪ ⎩
(Eq.(14)) is represented. We see
s≡0 s≡1 s≡2 s≡3 s≡4 (x + b, −y) if s ≡ 5
for any (x, y) ∈ E b (F3n ) and any integer s due to ⎧ 0 φ (x, y) = (x, y) ⎪ ⎨ 2 φ (x, y) = (x + b, y) (n−1)/2 φ (x, y) = 3 ⎪ ⎩ φ5 (x, y) = (x, −y)
(mod (mod (mod (mod (mod (mod
6), 6), 6), 6), 6), 6),
(20)
Eq.(5). Then we have
if if if φ (x, y) = (x + b, −y) if
n ≡ 1 (mod 12), n ≡ 5 (mod 12), n ≡ 7 (mod 12), n ≡ 11 (mod 12).
i
The notation of a(i) means a3 . We see that π n (xp , yp ) = (xp , yp )
(21)
for P = (xp , yp ) ∈ E b (F3n ) since xp and yp ∈ F3n . Then we have π n−1 (xp , yp ) = (−1) (−1) (xp , yp ). Therefore we see ⎧ (−1) (−1) ⎪ (xp , yp ) ⎪ ⎪ ⎨ (−1) (−1) + b, yp ) (xp 0003 P = (−1) (−1) ⎪ (xp , −yp ) ⎪ ⎪ ⎩ (−1) (−1) + b, −yp ) (xp
if if if if
n ≡ 1 (mod 12), n ≡ 5 (mod 12), n ≡ 7 (mod 12), n ≡ 11 (mod 12).
Note that φ3 (x, y) = −(x, y),
(22)
14
M. Shirase et al.
due to −(x, y) = (x, −y) and Eq.(20). Next we use induction for j to prove Lemma 1. Definition equations of gR and ψ, Eqs.(3) and (6), are utilized to prove Lemma 1 for j = 0. Case of n ≡ 1 (mod 12): By P 0003 = (xp
(−1)
(−1)
, yp
),
gP 0002 (ψ(Q)) = (yp(−1) )3 yq σ − ((x(−1) )3 − (ρ − xq ) + b)2 p = yp yq σ − (xp + xq + b − ρ)2 = −r02 + yp yq σ − r0 ρ − ρ2 , where r0 = xp + xq + b. Case of n ≡ 5 (mod 12): By P 0003 = (xp
(−1)
(−1)
+ b, yp
),
gP 0002 (ψ(Q)) = (yp(−1) )3 yq σ − ((x(−1) + b)3 − (ρ − xq ) + b)2 p = yp yq σ − (xp + xq − b − ρ)2 = −r02 + yp yq σ − r0 ρ − ρ2 , where r0 = xp + xq − b. Case of n ≡ 7 (mod 12): By P 0003 = (xp
(−1)
(−1)
, −yp
),
gP 0002 (ψ(Q)) = (−yp(−1) )3 yq σ − ((x(−1) )3 − (ρ − xq ) + b)2 p = −yp yq σ − (xp + xq + b − ρ)2 = −r02 − yp yq σ − r0 ρ − ρ2 , where r0 = xp + xq + b. Case of n ≡ 11 (mod 12): By P 0003 = (xp
(−1)
(−1)
+ b, −yp
),
gP 0002 (ψ(Q)) = (yp(−1) )3 yq σ − ((x(−1) + b)3 − (ρ − xq ) + b)2 p = −yp yq σ − (xp + xq + b − ρ)2 = −r02 − yp yq σ − r0 ρ − ρ2 , where r0 = xp + xq − b. We complete proving Lemma 1 for j = 0. We suppose that Lemma 1 is held for j = j 0003 . Then we easily see that Lemma 1 is also held for j = j 0003 + 1 with direct computations. 0002 Lemma 3. For P ∈ E b (F3n ), φ(P ) is equal to a value in the following table. b=1 b = −1 n ≡ 1 (mod 12) 3n P 3n P (n+1)/2 (n+1)/2 n ≡ 5 (mod 12) (−3 + 2)P (3 + 2)P n ≡ 7 (mod 12) 3n P 3n P n ≡ 11 (mod 12) (3(n+1)/2 + 2)P (−3(n+1)/2 + 2)P
Proof. Let P = (xp , yp ) be contained in E b (F3n ). Then we can use addition and duplication formulae of elliptic curves (see [16] for details) to obtain equations 0003 π(P ) + 2P = (xp − 1, −yp ) if b = 1, (23) −π(P ) + 2P = (xp + 1, −yp ) if b = −1.
Universal ηT Pairing Algorithm over Arbitrary Extension Degree
15
The following calculations complete the proof. Case of n ≡ 1 (mod 12), b = 1: 3n P = φn π 2n (P ) = φ(P )
by (4), (20), (21)
Case of n ≡ 1 (mod 12), b = −1: 3n P = φn π 2n (P ) = φ(P )
by(4), (20), (21)
Case of n ≡ 5 (mod 12), b = 1: (−3(n+1)/2 + 2)P = = = =
−φ(n+1)/2 π n+1 (P ) + 2P −φ3 π(P ) + 2P π(P ) + 2P φ(P )
by by by by
(4) (20), (21) (22) (23)
Case of n ≡ 5 (mod 12), b = −1: (3(n+1)/2 + 2)P = = = =
φ(n+1)/2 π n+1 (P ) + 2P φ3 π(P ) + 2P −π(P ) + 2P φ(P )
by by by by
(4) (20), (21) (22) (23)
Case of n ≡ 7 (mod 12), b = 1: 3n P = φn π 2n (P ) = φ(P )
by (4), (20), (21)
Case of n ≡ 7 (mod 12), b = −1: 3n P = φn π 2n (P ) = φ(P ) by
(4), (20), (21)
Case of n ≡ 11 (mod 12), b = 1: (3(n+1)/2 + 2)P = φ(n+1)/2 π n+1 (P ) + 2P = π(P ) + 2P = φ(P )
by (4) by (20), (21) by (23)
Case of n ≡ 11 (mod 12), b = −1: (−3(n+1)/2 + 2)P = −φ(n+1)/2 π n+1 (P ) + 2P = −π(P ) + 2P = φ(P ) We complete proving Lemma 3.
by (4) by (20), (21) by (23) 0002
Convertible Undeniable Proxy Signatures: Security Models and Efficient Construction0002 Wei Wu, Yi Mu, Willy Susilo, and Xinyi Huang Centre for Computer and Information Security Research School of Computer Science & Software Engineering University of Wollongong, Australia {wei,ymu,wsusilo,xh068}@uow.edu.au
Abstract. In the undeniable signatures, the validity or invalidity can only be verified via the Confirmation/Disavowal protocol with the help of the signer. Convertible undeniable signatures provide the flexibility that a signer can convert an undeniable signature into publicly verifiable one. A proxy signature scheme allows an entity to delegate his/her signing capability to another entity in such a way that the latter can sign messages on behalf of the former when the former is not available. Proxy signatures have found numerous practical applications in ubiquitous computing, distributed systems, mobile agent applications, etc. In this paper, we propose the first convertible undeniable proxy signature scheme with rigorously proven security. The properties of Unforgeability, Invisibility and Soundness in the context of convertible undeniable proxy signatures are also clearly defined. The security of our construction is formally proven in the random oracle models, based on some natural complexity assumptions. Keywords: Undeniable signatures, Proxy signatures, Convertible, Security models, Security proof.
1
Introduction
Undeniable signatures are like ordinary digital signatures, with the only difference that they are not universally verifiable and the validity or invalidity of an undeniable signature can only be verified via a Confirmation/Disavowal protocol with the help of the signer. Undeniable signatures have found various applications such as in licensing software, electronic voting and auctions. Since its introduction in Cypto’89 [5], there have been a number of schemes in the literature [1,6,7,8,9,11,19,20,21,23,24,25,31,32]. The concept of convertible undeniable signatures was introduced by Boyar, Chaum, Damg˚ ard and Pedersen [1] in Crypto’90. The new concept offers more flexibility than its original version in the sense that the signer has the ability to convert an undeniable signature into publicly verifiable one. Convertible undeniable signatures require two convert algorithms: individually convert algorithm and universally convert 0002
Supported by ARC Discovery Grant DP0557493 and DP0663306.
S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 16–29, 2007. c Springer-Verlag Berlin Heidelberg 2007 0002
Convertible Undeniable Proxy Signatures
17
algorithm. The individually convert algorithm allows the signer to convert an undeniable signature into a regular signature by releasing a piece of information at a later time. Moreover, the universally convert algorithm can help the signer to convert all the undeniable signatures into publicly verifiable ones. In this case, anyone can check the validity of any undeniable signature without any help of the signer. The notion of proxy signature was introduced by Mambo, Usuda and Okamoto [22]. Based on the delegation type, there are three types of proxy signatures: full delegation, partial delegation, and delegation by warrant. In the full delegation system, Alice’s secret key is given to Bob directly so that Bob can have the same signing capability as Alice. In practice, such schemes are obviously impractical and insecure. In a partial delegation proxy signature scheme, a proxy signer possesses a key, called private proxy key, which is different from Alice’s private key. Hence, proxy signatures generated by using the private proxy key are different from Alice’s signatures. However, in such schemes, the messages a proxy signer can sign are not limited. This weakness is eliminated in delegation by a warrant that specifies what kinds of messages are delegated. Here, the original signer uses the signing algorithm of a standard signature scheme and its secret key to sign a warrant and generate a signature on the warrant which is called as delegation. The proxy signer uses the delegation and his/her secret key to create a proxy signature on behalf of the original signer. According to whether the original signer can generate a valid proxy signature, proxy signatures can be further classified into proxy-unprotected and proxy-protected schemes. In a proxy-protected scheme only the proxy signer can generate proxy signatures, while in a proxyunprotected scheme either the proxy signer or the original signer can generate proxy signatures. In many applications, proxy-protected schemes are required to avoid the potential disputes between the original signer and the proxy signer. Related works about proxy signature can be found in [4,12,13,16,17,18,28]. Our contribution In this paper, we formalize the security models of the convertible undeniable proxy signatures. The properties Unforgeability, Invisibility and Soundness in the context of convertible undeniable proxy signatures are clearly defined. We then present a concrete construction of the convertible undeniable proxy signature. Our scheme is proxy-protected, even the original signer can not generate a valid convertible undeniable proxy signature. The validity or invalidity of a proxy signature can only be verified with the help of the proxy signer. The proxy signer can decide how to make his proxy signatures publicly verifiable, by releasing an individual proof or universal proof. The security of our construction is formally proved in the random oracle models, based on some natural complexity assumptions. To the best of our knowledge, our construction is the first convertible undeniable proxy signature scheme with rigorously proven security. Paper Organization The rest of this paper is organized as follows. In Section 2, we recall some fundamental backgrounds required throughout the paper. In Section 3, we propose
18
W. Wu et al.
the definition and security models of convertible undeniable proxy signatures. In Section 4, we provide our concrete convertible undeniable proxy signature scheme and its security analysis as well. Finally, we conclude our paper in Section 5.
2
Preliminaries
2.1
Bilinear Maps
Let G1 and GT be two groups of prime order q and let g be a generator of G1 . The map e : G1 × G1 → GT is said to be an admissible bilinear map if the following three conditions hold true: – e is bilinear, i.e. e(g a , g b ) = e(g, g)ab for all a, b ∈ ZZq . – e is non-degenerate, i.e. e(g, g) 0004= 1GT . – e is efficiently computable. We say that (G1 , GT ) are bilinear groups if there exists the bilinear map e : G1 × G1 → GT as above, and e, and the group action in G1 and GT can be computed efficiently. See [2] for more details on the construction of such pairings. 2.2
Complexity Assumptions
Discrete Logarithm Problem: Given (g, g a ) ∈ G, find a. Computational Diffie-Hellman Problem: Given a triple G elements (g, g a , g b ), find the element C = g ab . 3-Decisional Diffie-Hellman (3-DDH) Problem in G1: Given (g, g a , g b , g c , h) ?
∈ G51 for some randomly chosen a, b, c ∈ ZZq , decide whether h = g abc .
3 3.1
Formal Definitions of Convertible Undeniable Proxy Signatures Outline of Convertible Undeniable Proxy Signatures
The convertible undeniable proxy signature scheme consists of the following algorithms: CPGen is a probabilistic algorithm, on input a security parameter k, outputs a string cp which denotes the common scheme parameters including the message space M and the signature space S. KGen is a probabilistic algorithm, on input a common parameter cp, outputs the secret/public key-pairs (sk, pk) for the user in the system. SSign is a probabilistic (deterministic) algorithm, on input the common parameter cp, the signer’s secret key sk and the message m to be signed, outputs the standard signature σs .
Convertible Undeniable Proxy Signatures
19
SVer is a deterministic algorithm, on input the common parameter cp, the signer’s public key pk, the message-signature pair (m, σs ), outputs 1 if it is a valid message-signature pair. Otherwise, outputs 0. DGen is a probabilistic algorithm, on input system’s parameter cp, the original signer’s secret key sko and the warrant W to be signed, outputs the delegation σw . DVer is a deterministic algorithm, on input the common parameter cp, the original signer’s public key pko , the warrant-delegation pair (W, σw ), outputs 1 if it is a valid warrant-delegation pair. Otherwise, outputs 0. UPSign is a probabilistic algorithm, on input system’s parameter cp, the warrant W , the delegation σw , the secret key skp of the proxy signer and the message M to be signed, outputs the undeniable proxy signature σ. UPVer is a deterministic algorithm, on input the common parameter cp, original signer’s public key pko and the proxy signer’s secret/public key-pair (skp , pkp ), the warrant W , the signed message M and the signature σ, outputs 1 if it is a valid undeniable proxy signature. Otherwise, outputs 0. CProtocol is an interactive (non-interactive) algorithm, on input the common parameter cp, the original signer’s public key pko , the proxy signer’s secret key skp , (possibly) the verifier V ’s public key pkv , the warrant W , the message m and the signature σ, outputs a non-transferable transcript T rans which can convince V about σ. DProtocol is an interactive (non-interactive) algorithm, on input the common parameter cp, the original signer’s public key pko , the proxy signer’s secret key skp , (possibly) verifier’s public key pkv , the warrant W , the message m and the signature σ, outputs a non-transferable transcript T rans which can deny σ to V . ICon is a probabilistic (deterministic) algorithm, on input the common parameter cp, the proxy signer’s secret key skp , the warrant W , the message m (m,σ) and the signature σ, outputs an individual proof Π(pkp ,W ) of this message. IVer is a deterministic algorithm, on input the common parameter cp, the original signer’s public key pko , the proxy signer’s public key pkp , the undeniable (m,σ) proxy signature tuple (m, W, σ) and the individual proof Π(pkp ,W ) , outputs the verification decision d ∈ {Acc, Rej}. UCon is a deterministic algorithm, on input the common parameter cp and the proxy signer’s secret key skp , outputs the universal proof Πpkp . UVer is a deterministic algorithm, on input the common parameter cp, the original signer’s public key pko , the proxy signer’s public key pkp , any undeniable proxy signature tuple (m, W, σ) and the universal proof Πpkp , outputs the verification decision d ∈ {Acc, Rej}. 3.2
Adversaries and Oracles
We use a non-interactive designated verifier protocol as the CProtocol/DProtocol when we define the security of a convertible undeniable proxy signature scheme. Note that if the scheme employs the non-interactive confirmation and disavowal
20
W. Wu et al.
protocols, then it is not necessary to consider the active attack [27]. We allow the adversary to access the following oracles and submit their queries adaptively: – KGen Oracle: On a key generation query for the ith user, this oracle runs the KGen algorithm to generate a secret/public key pair (ski , pki ) of this user and the adversary receives the public key pki . – SSign Oracle: On a standard sign query (m, pk), this oracle runs the SSign algorithm to obtain σS and returns it to the adversary. – UPSign Oracle: On an undeniable proxy sign query (m, W, pko , pkp ), this oracle runs the UPSign algorithm to generate the undeniable proxy signature σ and returns it to the adversary. – UPVer Oracle: On a verify query (m, W, σ, pko , pkp ) (and possibly pkv ), this oracle first runs the UPVer algorithm to decide whether (m, W, σ) is a valid undeniable proxy signature tuple under the public keys pko and pkp and outputs the decision result d ∈ {0, 1}. The adversary receives a transcript of CProtocol if d = 1. Otherwise, The adversary receives a transcript of DProtocol. – ICon Oracle: On an individually convert query (m, W, σ, pkp ), this oracle runs (m,σ) the ICon algorithm to generate the individual proof Π(pkp ,W ) and returns it to the adversary. – UCon Oracle: On a universally convert query pkp , this oracle runs UCon algorithm to generate the universal proof Πpkp and returns it to the adversary. – Corruption Oracle: On a corruption query pk, this oracle returns the corresponding secret key to the adversary. 3.3
Completeness
Essentially, completeness means that valid (invalid) signatures can always be proved valid (invalid). It can be described as following two cases: 1. If the UPVer algorithm outputs 1 for an undeniable proxy signature tuple (m, W, σ) under the public keys (pko , pkp ), then (a) (m, W, σ) can be confirmed by the CProtocol. (b) IVer algorithm will output Acc on input (m, W, σ) together with a valid (m,σ) individual proof Π(pkp ,W ) . (c) UVer algorithm will output Acc, on input (m, W, σ) together with a valid universal proof Πpkp . 2. If the UPVer algorithm outputs 0 for a message-signature pair (m, W, σ) under the public keys (pko , pkp ), then (a) (m, W, σ) can be denied by the DProtocol. (b) IVer algorithm will output Rej on input (m, W, σ) together with a valid (m,σ) individual proof Π(pkp ,W ) . (c) UVer algorithm will output Rej on input (m, W, σ) together with a valid universal proof Πpkp .
Convertible Undeniable Proxy Signatures
3.4
21
Non-transferability
The Non-Transferability requires that the transcript of the CProtocol/DProtocol with the designated verifier V can only convince V the validity or invalidity of an undeniable proxy signature tuple (m, W, σ). No one else could be convinced by this T rans even if V shares all his secret information (including his secret key) with this party. The CProtocol/DProtocol is non-transferable if there exists a probabilistic polynomial time algorithm A with input the verifier V ’s secret key skv such that for any other computationally unbounded verifier V0002 , any tuple (m, W, σ), the transcript of the CProtocol/DProtocol generated by A is indistinguishable from that generated by the proxy signer. 3.5
Unforgeability
The standard notion of the security for digital signatures was defined by Goldwasser, Micali and Rivest [10]. In this paper, we use the same way to define the existential unforgeability of the convertible undeniable proxy signature scheme: adversary has access to all the oracles defined in Section 3.2. We say F wins if F outputs a valid undeniable proxy signature tuple (m∗ , Wf , σ ∗f ) under the original signer’s public keys pko and the proxy signer’s public key pkp , such that F has never issued (m∗ , Wf , pko , pkp ) to the UPSign Oracle and one of the following two requirements is satisfied: 1. pkp has never been submitted to the Corruption Oracle. 2. pko has never been submitted to the Corruption Oracle and (Wf , pko ) has not been submitted to the SSign Oracle. The success probability of an adaptively chosen message and chosen public key CMA, CP KA forger F wins the above game is defined as Succ FEUF, CUP S . Definition 1. We say a convertible undeniable proxy signature scheme is unCMA, CP KA forgeable against a (t, qH , qKG , qSS , qUP S , qV , qIC , qUC , qC ) forger FEUF, CUP S , CMA, CP KA if FEUF, CUP S runs in time at most t, makes at most qH queries to the random oracles, qKG queries to KGen Oracle, qSS queries to SSign Oracle, qUP S queries to UPSign Oracle, qV queries to UPVer Oracle, qIC queries to the ICon Oracle, qUC queries CMA, CP KA to the UCon Oracle, qC queries to the Corruption Oracle and Succ FEUF, CUP S is negligible. 3.6
Invisibility
Given an undeniable proxy signature tuple (m, W, σ), the public keys pko of the original signer and pkp of the proxy signer, the invisibility property requires that it is difficult to decide whether it is a valid undeniable proxy signature without (m,σ) the knowledge of the proxy signer’s secret key, the individual proof Π(pkp ,W ) or universal proof Πpkp , even though the adversary knows the secret key sko of the original signer. It is defined using the games between the oracles defined in Section 3.2 and an adaptively chosen message attacker and chosen public key CMA, CP KA distinguisher DIN V, CUP S . The games are divided into two phases.
22
W. Wu et al.
– Phase 1: In this phase, the distinguisher D can adaptively access all the Oracles. – Challenge: When the distinguisher D decides the first phase is over, he submits (m∗ , Wf , pko , pkp ) to UPSign Oracle as the challenge with the constraints that 1. pkp has never been submitted to the Corruption Oracle or UCon Oracle during Phase 1. 2. (m∗ , Wf , pko , pkp ) has never been submitted to the UPSign Oracle during Phase 1. As a response, the UPSign Oracle chooses a random bit γ ∈ {0, 1}. If γ = 0, this oracle will run UPSign algorithm to generate the undeniable signature σ and sets σ ∗f = σ. Otherwise, this oracle chooses a random element σ ∗f in the signature space S. Then, this oracle returns the challenging signature σ ∗f to D. – Phase 2: On receiving the challenging signature, the distinguisher D still can access the oracles adaptively except that 1. pkp can not be submitted to the Corruption Oracle or UCon Oracle. 2. (m∗ , Wf , σ ∗ , pkp ) can not be submitted to the ICon Oracle. 3. (m∗ , Wf , pko , pkp ) cannot be submitted to the UPSign Oracle. 4. (m∗ , Wf , σ ∗f , pko , pkp ) cannot be submitted to the UPVer Oracle. – Guessing: Finally, the distinguisher D outputs a guess γ 0004 . The adversary wins the game if γ = γ 0004 . The advantage of an adaptively chosen message and chosen public key distinCMA, CP KA = Pr[γ = guisher D has in the above game is defined as Adv DIN V, CUP S 0004 γ ] − 1/2 . Definition 2. We say a convertible undeniable proxy signature scheme is invisiCMA, CP KA ble against a (t, qH , qKG , qSS , qUP S , qV , qIC , qUC , qC ) distinguisher DIN V, CUP S , CMA, CP KA if DIN V, CUP S runs in time at most t, makes at most qH queries to the random oracles, qKG queries to KGen Oracle, qSS queries to SSign Oracle, qUP S queries to UPSign Oracle, qV queries to UPVer Oracle, qIC queries to the ICon Oracle, qUC queries CMA, CP KA to the UCon Oracle, qC queries to the Corruption Oracle and Adv DIN V, CUP S is negligible. Remark: Anonymity is another security requirement of undeniable signatures. It has been shown in [14] that the property Invisibility and Anonymity are closely related in the notion of convertible undeniable signatures. 3.7
Soundness
Basically, soundness means that even the proxy signer himself can not convince a verifier V that a valid (invalid) signature is invalid (valid) without corrupting V 0004 s secret key. It is defined by the games against an adversary S0002 who can adaptively access to all the Oracles defined in Section 3.2. After all the queries, We say S0002 wins the game if S0002 can output (m∗ , Wf , σ ∗f , T rans∗ , pko , pkp , pkv ) with the restrictions that
Convertible Undeniable Proxy Signatures
23
1. S0002 has never queried pkv to the Corruption Oracle. 2. σ ∗f is not a valid undeniable proxy signature of message m∗ under the warrant W and the public keys pko , pkp and T rans∗ is a transcripts output by CProtocol. Or, σ ∗f is a valid undeniable proxy signature of message m∗ under the warrant W and the public keys pko , pkp and T rans∗ is a transcript output by DProtocol. The success probability of an adaptively chosen message and chosen public key CMA, CP KA adversary S0002 wins the above game is defined as Succ S0002Sound, CUP S . Definition 3. We say a convertible undeniable proxy signature scheme satisfies the property of soundness against a (t, qH , qKG , qSS , qUP S , qV , qIC , qUC , qC ) adCMA, CP KA 0002CMA, CP KA versary S0002Sound, CUP S , if SSound, CUP S runs in time at most t, makes at most qH queries to the random oracles, qKG queries to KGen Oracle, qSS queries to the SSign Oracle, qUP S queries to UPSign Oracle, qV queries to UPVer Oracle, qIC queries to the ICon Oracle, qUC to the UCon Oracle, qC queries to the Corruption CMA, CP KA Oracle and Succ S0002Sound, CUP S is negligible.
4
Our Proposed Scheme
In this section we will describe our convertible undeniable proxy signature scheme with security analysis. 4.1
Concrete Scheme
CPGen: Let (G1 , GT ) be bilinear groups where G1 = GT = q, for some prime number q ≥ 2k , k be the system security number and g be the generator of G1 . e denotes the bilinear map G1 × G1 → GT . Let h0 , h1 , h2 , h3 : {0, 1}∗ → G∗1 , h4 : {0, 1}∗ → ZZq be five secure cryptographic hash functions. KGen: The original signer O picks xo , yo ∈R ZZ∗q and sets the secret key sko = (xo , yo ). Then O computes the public key pko = (Xo , Yo ) = (g xo , g yo ). The proxy signer P picks xp , yp ∈R ZZ∗q and sets the secret key skp = (xp , yp ). Then P computes the public key pkp = (Xp , Yp ) = (g xp , g yp ). Similarly, the verifier V ’s secret/public key-pair is (skv , pkv ) = (xv , yv , Xv , Yv ) where xv , yv are randomly chosen in ZZ∗q . SSign: Let m be the message to be signed by the signer whose secret/public key pair is (x, y, X, Y ) = (x, y, g x , g y ). The standard signature is generated as: σs = h0 (m)x . SVer: Given the message m and the standard signature σs , anyone can verify ? whether e(σs , g) = e(h0 (m), X). If the equality holds, outputs 1. Otherwise, the result is 0. DGen: Let W be the warrant to be signed by the original signer O who wants to delegate his signing rights to the proxy signer P . O runs the algorithm SSign to generate the delegation σw = h0 (W )xo . Then O sends the warrant W and the delegation σw to P .
24
W. Wu et al.
DPVer: Given the warrant W and the delegation σw , the proxy signer P runs the algorithm SVer to check its validity. UPSign: For a message m to be signed, let ϑ = m0006pko 0006pkp 0006W . P chooses a random number t in ZZ∗q and computes σ1 = σw h1 (ϑ)xp yp h2 (ϑ)yp h3 (m W σ2 )t = h0 (W )xo h1 (ϑ)xp yp h2 (ϑ)yp h3 (m W σ2 )t and σ2 = g t . The signature is generated as σ = (σ1 , σ2 ). UPVer: For a signature σ = (σ1 , σ2 ) on the message m and warrant W , let ϑ = m0006pko 0006pkp 0006W . The proxy signer uses his secret key xp , yp to check ?
if e(σ1 , g) = e(h0 (W ), Xo )e(h1 (ϑ), g xp yp )e(h2 (ϑ), Yp )e(h3 (m W σ2 ), σ2 ). If the equality holds, output 1. Otherwise, outputs 0. CProtocol: Given the verifier V ’s public key pkv = (Xv , Yv ), a warrant W ∈ {0, 1}∗, a message m ∈ {0, 1}∗, and a valid signature σ = (σ1 , σ2 ) to be confirmed, let ϑ = m0006pko 0006pkp 0006W . The proxy signer P will use the designated verifier techniques [15] to prove its validity. – The proxy signer P chooses cv , dv , r ∈R ZZ∗q and computes: 1. A = g r , B = e(h1 (ϑ), Xp )r , C = g dv Yvcv . 2. h = h4 (m0006W 0006σ0006pkv 0006A0006B0006C), cs = h−cv (mod q) and ds = r−cs yp (mod q). P then sends (cs , cv , ds , dv ) to the verifier V . – On receiving (cs , cv , ds , dv ), the verifier V computes A0004 = g ds Ypcs , B 0004 = e(h1 (ϑ), Xp )ds [e(σ1 , g)/(e(h0 (W ), Xo )e(h2 (ϑ), Yp )e(h3 (m W σ2 ), ?
σ2 ))]cs , C 0004 = g dv Yvcv . Then, V checks whether cs +cv = h4 (m0006W 0006σ0006pkv 0006 A0004 0006B 0004 0006C 0004 ). If the equality holds, V will accept σ as a valid undeniable proxy signature. DProtocol: Given the verifier V ’s public key pkv = (Xv , Yv ), a warrant W ∈ {0, 1}∗, a message m ∈ {0, 1}∗, and a signature σ = (σ1 , σ2 ) to be disavowed, let ϑ = m0006pko 0006pkp 0006W . The proxy signer P will use the designated verifier techniques [15] to prove its invalidity. – The proxy signer P chooses cv , dv , r, α, β ∈R ZZ∗q and computes: 1. Z = e(σ1 , g)/(e(h0 (W ), Xo )e(h2 (ϑ), Yp )e(h3 (m W σ2 ), σ2 )). 2. A = [e(h1 (ϑ), Xp )yp /Z]r , B = e(h1 (ϑ), Xp )α /Z β , C = g α /Ypβ , D = g dv Yvcv . 3. h = h4 (m0006W 0006σ0006pkv 0006A0006B0006C0006D) and cs = h − cv (mod q). 4. ds = α + cs yp r (mod q) and d0003s = β + cs r (mod q). The proxy signer P then sends (A, cs , cv , ds , d0003s , dv ) to the verifier V . – On receiving (A, cs , cv , ds , d0003s , dv ), V computes 1. Z = e(σ1 , g)/(e(h0 (W ), Xo )e(h2 (ϑ), Yp )e(h3 (m W σ2 ), σ2 )). 0004 0004 2. B 0004 = e(h1 (ϑ), Xp )ds /(Z ds · Acs ), C 0004 = g ds /Ypds , D0004 = g dv Yvcv . If A 0004= 1 and cs + cv = h4 (m0006W 0006σ0006pkv 0006A0006B 0004 0006C 0004 0006D0004 ), V will believe σ is not a valid undeniable proxy signature. ICon: When the proxy signer wants to make his undeniable proxy signature tuple (m, W, σ) publicly verifiable, he first runs the algorithm UPVer:
Convertible Undeniable Proxy Signatures
25
– If the algorithm UPVer outputs 1, the proxy signer generates the in(m,σ) dividual proof Π(pkp ,W ) as follows: it first chooses r ∈R ZZ∗q and computes A = g r , B = e(h1 (ϑ), Xp )r , where ϑ = m0006pko 0006pkp 0006W , cs = h4 (m0006W 0006σ0006A0006B), ds = r − cs yp (mod q). Then, the proxy signer sets (m,σ) Π(pkp ,W ) = (cs , ds ). – Otherwise, the algorithm UPVer outputs 0. The proxy signer P chooses r, α, β ∈R ZZ∗q and computes: Z = e(σ1 , g)/(e(h0 (W ), Xo )e(h2 (ϑ), Yp )e(h3 (m W σ2 ), σ2 )), A = [e(h1 (ϑ), Xp )yp /Z]r , B = e(h1 (ϑ), Xp )α /Z β , C = g α /Ypβ , cs = h4 (m0006W 0006σ0006pkv 0006A0006B0006C), ds = α + cs yp r (mod q) and d0003s = β + cs r (m,σ) (mod q). P then sets Π(pkp ,W ) = (A, cs , ds , d0003s ). IVer: For an undeniable proxy signature tuple (m, W, σ) and the individual proof (m,σ) Π(pkp ,W ) , – If Π(pkp ,W ) has the form (cs , ds ), the verifier V computes A0004 = g ds Ypcs , B 0004 = e(h1 (ϑ), Xp )ds [e(σ1 , g)/(e(h0 (W ), Xo )e(h2 (ϑ), Yp )e(h3 (m W σ2 ), (m,σ)
?
σ2 ))]cs . Then, V checks whether cs = h4 (m0006W 0006σ0006pkv 0006A0004 0006B 0004 ). If the equality holds, V will accept σ as a valid undeniable proxy signature. (m,σ) – Otherwise, Π(pkp ,W ) = (A, cs , ds , d0003s ). V computes Z = e(σ1 , g)/(e(h0 (W ), Xo )e(h2 (ϑ), Yp )e(h3 (m W σ2 ), σ2 )), 0004
0004
B 0004 = e(h1 (ϑ), Xp )ds /(Z ds · Acs ), C 0004 = g ds /Ypds . If A 0004= 1 and cs = h4 (m0006W 0006σ0006pkv 0006A0006B 0004 0006C 0004 ), V will believe σ is not a valid undeniable proxy signature. UCon: When the proxy signer wants to make all his undeniable proxy signatures publicly verifiable, he computes Πpkp = g xp yp and publishes Πpkp . UVer: For any undeniable proxy signature tuple (m, W, σ) and the universal proof Πpkp , let ϑ = m0006pko 0006pkp 0006W . ?
– anyone can check whether e(g, Πpkp ) = e(Xp , Yp ). If this equality holds, one continues to compute – Z = e(σ1 , g)/(e(h0 (W ), Xo )e(h2 (ϑ), Yp )e(h3 (m W σ2 ), σ2 )) and check ?
Z = e(h1 (ϑ), Πpkp ). If this equality holds as well, one can accept σ as a valid undeniable proxy signature. Otherwise, it is invalid. 4.2
Security Analysis of the Proposed Scheme
In this section, we will give a formal security analysis of our proposed schemes. Theorem 1. The proposed scheme satisfies the property Completeness and Non-Transferability.
26
W. Wu et al.
Proof. It is easy to check our scheme satisfies the definition of the Completeness in Section 3.3. We will show our protocols satisfy the property of NonTransferability. Given any undeniable proxy signature tuple (m, W, σ) (either valid or invalid) and two public keys pko , pkp , the verifier’s secret key skv = (xv , yv ), let ϑ = m0006pko 0006pkp 0006W . The algorithm A can output the transcripts of the CProtocol and DProtocol as follows. 1. To simulate a transcript of the confirmation protocol which is designated to V , A randomly chooses cs , ds ∈ ZZq , r ∈ ZZ∗q and computes Z = e(σ1 , g)/(e(h2 (ϑ), Yp )e(h0 (W ), Xo )e(h3 (m W σ2 ), σ2 )), A = g ds Ypcs , B = e(h1 (ϑ), Xp )ds Z cs and C = g r . Then A computes cv = h4 (m0006W 0006σ0006pkv 0006A0006B0006C) − cs (mod q) and dv = r − cv yv (mod q). At last, A outputs (cs , cv , ds , dv ) as the transcript of the confirmation protocol. 2. To simulate a transcript of the disavowal protocol which is designated to V , A chooses a random element A ∈ GT such that A 0004= 1. A continues to choose four random elements cs , ds , d0003s ∈ ZZq , r ∈ ZZ∗q and computes Z = e(σ1 , g)/(e(h2 (ϑ), Yp )e(h0 (W ), Xo )e(h3 (m W σ2 ), σ2 )), 0004
0004
B = e(h1 (ϑ), Xp )ds /(Z ds ·Acs ), C = g ds /Ypds and D = g r . Then V computes cv = h4 (m0006W 0006σ0006pkv 0006A0006B0006C0006D) − cs (mod q) and dv = r − cv yv (mod q). At last, A outputs (A, cs , cv , ds , d0003s , dv ) as the transcript of the disavowal protocol. Due to the random numbers chosen by A, the transcripts generated by A are indistinguishable from those generated by the proxy signer P . Theorem 2. If there exists a (t, qH , qKG ,qSS , qUP S , qV , qIC , qUC , qC ) forger CMA, CP KA FEUF, CUP S can win the game defined in Section 3.5 with non-negligible sucCMA, CP KA cess probability Succ FEUF, CUP S , then there exists an algorithm A who can use F to solve a random instance of Computational Diffie-Hellman problem 4 2 2 qSS +qU P S +2 with probability SuccCDH A,G1 ≥ (qSS +qU P S )2 (1 − qU P S +qSS +2 ) qKG (1 − 1 qC qKG ) Succ
CMA,CP KA FEUF,CUP S in polynomial time.
Proof. Due to the page limitation, please refer to the full version of this paper. Theorem 3. If there exists a (t, qH , qKG , qSS , qUP S , qV , qIC , qUC , qC ) distinCMA, CP KA CMA, CP KA guisher DIN V, CUP S who can have non-negligible advantage Adv DIN V, CUP S in the game defined in Section 3.6, then there exists an algorithm A who can use D to solve a random instance of 3-Diffie-Hellman prob= qKG (q1U P S )2 (1 − qU P1S +1 )2(qU P S +1) (1 − lem with advantage Adv A3−DDH G1 1 qU C +qC qKG )
CMA,CP KA Adv DIN in polynomial time. V,CUP S
Convertible Undeniable Proxy Signatures
27
Proof. Due to the page limitation, please refer to the full version of this paper. Theorem 4. If there exists a (t, qH , qKG , qSS , qUP S , qV , qIC , qUC , qC ) adversary CMA, CP KA S0002Sound, CUP S can win the game defined in Section 3.7 with non-negligible success probability Succ S0002CMA, CP KA , then there exists an algorithm A who can use Sound, CUP S
S0002 to solve a random instance of Discrete Logarithm problem with probability 1 1 qC 0002CMA,CP KA SuccDL A,G1 ≥ qKG (1 − qKG ) Succ SSound, CUP S in polynomial time. Proof. Due to the page limitation, please refer to the full version of this paper.
5
Conclusion
In this paper, we proposed the first convertible undeniable proxy signature scheme with formal security analysis. The signature for a message consists of two elements in the group G1 , which is only around 340 bits when appropriate elliptic curve is used. We provided a formal security analysis to show that our scheme satisfies the properties of Unforgeability, Invisibility and Soundness in the random oracle models.
Acknowledgement The authors would like to thank the anonymous referees of the 8th International Workshop on Information Security Applications (WISA 2007) for the suggestions to improve this paper.
References 1. Boyar, J., Chaum, D., Damg˚ ard, I.B., Pedersen, T.P.: Convertible Undeniable Signatures. In: Menezes, A.J., Vanstone, S.A. (eds.) CRYPTO 1990. LNCS, vol. 537, pp. 189–205. Springer, Heidelberg (1991) 2. Boneh, D., Lynn, B., Shacham, H.: Short Signatures from the Weil Pairing. In: Boyd, C. (ed.) ASIACRYPT 2001. LNCS, vol. 2248, pp. 514–532. Springer, Heidelberg (2001) 3. Boyar, J., Chaum, D., Damgard, I.B., Pedersen, T.P.: Convertible Undeniable Signatures. In: Menezes, A.J., Vanstone, S.A. (eds.) CRYPTO 1990. LNCS, vol. 537, pp. 189–205. Springer, Heidelberg (1991) 4. Boldyreva, A., Palacio, A., Warinschi, B.: Secure Proxy Signature Schemes for Delegation of Digning Rights, http://eprint.iacr.org/2003/096 5. Chaum, D., Antwerpen, H.v.: Undeniable Signatures. In: Brassard, G. (ed.) CRYPTO 1989. LNCS, vol. 435, pp. 212–216. Springer, Heidelberg (1990) 6. Damg˚ ard, I.B., Pedersen, T.P.: New Convertible Undeniable Signature Schemes. In: Maurer, U.M. (ed.) EUROCRYPT 1996. LNCS, vol. 1070, pp. 372–386. Springer, Heidelberg (1996) 7. Galbraith, S.D., Mao, W., Paterson, K.G.: RSA-Based Undeniable Signatures for General Moduli. In: Preneel, B. (ed.) CT-RSA 2002. LNCS, vol. 2271, pp. 200–217. Springer, Heidelberg (2002)
28
W. Wu et al.
8. Galbraith, S.D., Mao, W.: Invisibility and Anonymity of Undeniable and Confirmer Signatures. In: Joye, M. (ed.) CT-RSA 2003. LNCS, vol. 2612, pp. 80–97. Springer, Heidelberg (2003) 9. Gennaro, R., Krawczyk, H., Rabin, T.: RSA-Based Undeniable Signatures. In: Kaliski Jr., B.S. (ed.) CRYPTO 1997. LNCS, vol. 1294, pp. 132–149. Springer, Heidelberg (1997) 10. Goldwasser, S., Micali, S., Rivest, R.: A Digital signature scheme secure against adaptively chosen message attacks. SIAM Journal on Computing 17(2), 281–308 (1988) 11. Gennaro, R., Rabin, T., Krawczyk, H.: RSA-Based Undeniable Signatures. Journal of Cryptology 13(4), 397–416 (2000) 12. Huang, X., Mu, Y., Susilo, W., Zhang, F., Chen, X.: A short proxy signature scheme: Efficient authentication in the ubiquitous world. In: Enokido, T., Yan, L., Xiao, B., Kim, D., Dai, Y., Yang, L.T. (eds.) Embedded and Ubiquitous Computing – EUC 2005 Workshops. LNCS, vol. 3823, pp. 480–489. Springer, Heidelberg (2005) 13. Huang, X., Susilo, W., Mu, Y., Wu, W.: Proxy Signature without Random Oracles. In: Cao, J., Stojmenovic, I., Jia, X., Das, S.K. (eds.) MSN 2006. LNCS, vol. 4325, pp. 473–484. Springer, Heidelberg (2006) 14. Huang, X., Mu, Y., Susilo, W., Wu, W.: Provably Secure Pairing-based Convertible Undeniable Signature with Short Signature Length. In: International Conference on Pairing-based Cryptography (Pairing 2007), Tokyo, Japan, July 2-4, 2007. LNCS, Springer, Heidelberg (to appear, 2007) 15. Jakobsson, M., Sako, K., Impagliazzo, R.: Designated Verifier Proofs and Their Applications. In: Maurer, U.M. (ed.) EUROCRYPT 1996. LNCS, vol. 1070, pp. 143–154. Springer, Heidelberg (1996) 16. Kim, S., Park, S., Won, D.: Proxy Signatures, Revisited. In: Han, Y., Quing, S. (eds.) ICICS 1997. LNCS, vol. 1334, pp. 223–232. Springer, Heidelberg (1997) 17. Lee, J.-Y., Cheon, J.H., Kim, S.: An analysis of proxy signatures: Is a secure channel necessary? In: Joye, M. (ed.) CT-RSA 2003. LNCS, vol. 2612, pp. 68–79. Springer, Heidelberg (2003) 18. Lee, B., Kim, H., Kim, K.: Secure mobile agent using strong nondesignated proxy signature. In: Varadharajan, V., Mu, Y. (eds.) ACISP 2001. LNCS, vol. 2119, pp. 474–486. Springer, Heidelberg (2001) 19. Libert, B.: Jean-Jacques Quisquater. In: Okamoto, T. (ed.) CT-RSA 2004. LNCS, vol. 2964, pp. 112–125. Springer, Heidelberg (2004) 20. Lyuu, Y.-D., Wu, M.-L.: Convertible Group Undeniable Signatures. In: Lee, P.J., Lim, C.H. (eds.) ICISC 2002. LNCS, vol. 2587, pp. 48–61. Springer, Heidelberg (2003) 21. Laguillaumie, F., Vergnaud, D.: Time-Selective Convertible Undeniable Signatures. In: Menezes, A.J. (ed.) CT-RSA 2005. LNCS, vol. 3376, pp. 154–171. Springer, Heidelberg (2005) 22. Mambo, M., Usuda, K., Okamoto, E.: Proxy signature: Delegation of the power to sign messages. IEICE Trans. Fundamentals E79-A(9), 1338–1353 (1996) 23. Miyazaki, T.: An Improved Scheme of the Gennaro-Krawczyk-Rabin Undeniable Signature System Based on RSA. In: Won, D. (ed.) ICISC 2000. LNCS, vol. 2015, pp. 135–149. Springer, Heidelberg (2001) 24. Michels, M., Petersen, H., Horster, P.: Breaking and Repairing a Convertible Undeniable Signature Scheme. In: Third ACM Conference on Computer and Communications Security, pp. 148–152. ACM Press, New York (1996)
Convertible Undeniable Proxy Signatures
29
25. Monnerat, J., Vaudenay, S.: Undeniable Signatures Based on Characters: How to Sign with One Bit. In: Bao, F., Deng, R., Zhou, J. (eds.) PKC 2004. LNCS, vol. 2947, pp. 69–85. Springer, Heidelberg (2004) 26. Monnerat, J., Vaudenay, S.: Short 2-Move Undeniable Signatures. In: International Conference on Cryptology in Vietnam 2006. LNCS, Springer, Heidelberg (to appear) 27. Ogata, W., Kurosawa, K., Heng, S.-H.: The Security of the FDH Variant of Chaums Undeniable Signature Scheme. In: Vaudenay, S. (ed.) PKC 2005. LNCS, vol. 3386, pp. 328–345. Springer, Heidelberg (2005) 28. Park, H.-U., Lee, I.-Y.: A digital nominative proxy signature scheme for mobile communications. In: Qing, S., Okamoto, T., Zhou, J. (eds.) ICICS 2001. LNCS, vol. 2229, pp. 451–455. Springer, Heidelberg (2001) 29. Pointcheval, D., Stern, J.: Security arguments for digital signatures and blind signatures. Journal of Cryptology 13(3), 361–396 (2000) 30. Shamir, A.: Identity-based Cryptosystems and Signature Schemes. In: Blakely, G.R., Chaum, D. (eds.) CRYPTO 1984. LNCS, vol. 196, pp. 47–53. Springer, Heidelberg (1985) 31. Wang, G., Qing, S., Wang, M., Zhou, Z.: Threshold Undeniable RSA Signature Scheme. In: Qing, S., Okamoto, T., Zhou, J. (eds.) ICICS 2001. LNCS, vol. 2229, pp. 221–232. Springer, Heidelberg (2001) 32. Zhang, F., Safavi-Naini, R., Susilo, W.: Attack on Han et al.’s ID-based Confirmer (Undeniable) Signature at ACM-EC 2003, http://eprint.iacr.org/2003/129
Secret Signatures: How to Achieve Business Privacy Efficiently? Byoungcheon Lee1,0002 , Kim-Kwang Raymond Choo2,00020002 , Jeongmo Yang1 , and Seungjae Yoo1 1
Department of Information Security, Joongbu University 101 Daehak-Ro, Chubu-Myeon, Geumsan-Gun, Chungnam, 312-702, Korea {sultan,jmyang,sjyoog}@joongbu.ac.kr 2 Australian Institute of Criminology GPO Box 2944, Canberra ACT 2601, Australia [email protected]
Abstract. Digital signatures provide authentication and non-repudiation in a public way in the sense that anyone can verify the validity of a digital signature using the corresponding public key. In this paper, we consider the issues of (1) signature privacy and (2) the corresponding public provability of signature. We propose a new digital signature variant, secret signature, which provides authentication and non-repudiation to the designated receiver only. If required, the correctness of the secret signature can be proven to the public either by the signer or the receiver. We conclude with a discussion to demonstrate the usefulness of the proposed cryptographic primitive (e.g., achieving signature privacy in an efficient manner). Keywords: Secret signature, signature privacy, public provability, key agreement, anonymity, public auction.
1
Introduction
Digital signature, first proposed by Diffie and Hellman in 1976 [11], is an electronic version of handwritten signatures for digital documents. A digital signature on some message, m, is generated by a signer, A, using a secret signing key, skA . The correctness of the generated signature is verified using the corresponding public key, pkA . It provides authentication and non-repudiation in a public way, in the sense that anyone can verify the validity of the digital signature, since pkA is public information. 0002
00020002
This work was supported by Korea Research Foundation Grant funded by Korea Government (MOEHRD, Basic Research Promotion Fund), grant No. KRF-2005003-D00375. The views and opinions expressed in this paper are those of the author and do not reflect those of the Australian Government or the Australian Institute of Criminology. This research was not undertaken as part of the author’s work at the Australian Institute of Criminology.
S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 30–47, 2007. c Springer-Verlag Berlin Heidelberg 2007 0002
Secret Signatures: How to Achieve Business Privacy Efficiently?
31
Signature Privacy. In this paper, we consider a business scenario where both the sender (signer) and the receiver (verifier) wish to keep their exchanged signatures private – we term this signature privacy. Such a (signature privacy) property does not appear to be easily achieved using general digital signature schemes. A ‘straightforward’ approach would be to encrypt a digital signature with the receiver’s public key (or with an agreed key) so that only the legitimate receiver can decrypt and retrieve the original signature. This is the so-called sign-then-encrypt approach, which is widely adopted in the real world. In order to implement the signature and encryption operations in a more efficient manner, signcryption [26] was proposed in 1997 by Zheng. Alternative solutions to achieve signature privacy include using special signature schemes that limit the verifiability of the signature only to a designated entity (e.g., the designated verifier signature (DVS) [6,15] and the limited verifier signature (LVS) [1,9]). Public Provability of Signature. Assume that we use a signature scheme designed to provide signature privacy. In the event that a dispute arises between a signer and a receiver during a business transaction, any chosen third party (e.g., a judge or the public) should be able to prove the correctness of the digital signature. We term such a property the public provability of a signature. This property can be easily achieved using either general digital signature schemes (i.e., by verifying the signature using the signer’s public key) or the sign-thenencrypt schemes (i.e., by decrypting the encrypted signature and verifying the retrieved signature in a standard way). In the latter sign-then-encrypt approach, proving the correctness of decryption is a computationally expensive operation (e.g., include zero-knowledge proofs). If we use signature schemes designed to provide signature privacy, then the public provability of generated signatures becomes an important requirement to ensure the fairness of business transactions. Although signcryption schemes appear to provide such a public provability feature, the original authors have not specified (this feature) explicitly. DVS and LVS schemes. DVS and LVS schemes, on the other hand, are unable to provide the public provability feature, since the designated verifier cannot transfer his conviction to others as the designated verifier is able to open the signature in any way of his choice using the knowledge of his/her private key. Our approach. In this paper, we introduce a new digital signature variant, secret signature (SS), designed to provide signature privacy. Advantages of our proposed SS scheme include providing all the following properties efficiently: 1. Authenticity and non-repudiation; 2. Signature privacy; and 3. Public provability of signature. More specifically, in our proposed SS scheme: – A signer, A, computes a signature using A’s private key together with the public key of a receiver, B. – B can then verify the signature using B’s private key and A’s public key.
32
B. Lee et al.
– Given a (signed) message, no one other than the designated receiver can identify the message’s authorship (e.g., what message is signed by whom and addressed to whom). – If required, either A or B can provide a proof of validity of the signature. Given the proof of validity, any third party will be able to verify the validity of the associated signature. To obtain the above-mentioned functionalities, we combine a secure signature scheme with a non-interactive one-way key agreement scheme between a signer and a receiver. In other words, our proposed SS can be viewed as a signature on a message and a secret agreed key. The designated receiver can recover the secret agreed key and verify the signature, but no third party can identify what message is signed by whom and addressed to whom. Secret signature is useful in many real-world applications where 1. the privacy of the generated signature needs to be maintained, but the authorship of the signature can be publicly proven at a later stage, 2. message confidentiality is not required, and 3. efficiency is critical. A main distinction between our proposed SS scheme and general signcryption schemes1 is that SS scheme provides signature privacy without using encryption. Thus it is more efficient than signcryption in many applications where message confidentiality is not required. We describe some application examples in Section 7. Outline. We define the proposed SS scheme and provide security definitions in Section 2. In the following section, a concrete construction example of SS in discrete logarithm (DL)-based cryptosystems and the security proofs in the random oracle model are presented. We then present a brief discussion on how to prove the validity of SS in Section 4. We compare the features of SS with previously published signature privacy-related schemes in Section 5 and compare the efficiency of SS with signcryption in Section 6. Several possible applications are presented in Section 7. Section 8 concludes this paper.
2
Definitions
2.1
Definition of Secret Signature Scheme
There are two entities in the SS scheme, namely, a signer (sender), A, and a verifier (receiver), B. The formal definition of SS scheme is as follows. Definition 1 (Secret Signature Scheme). A secret signature (SS) scheme consists of the following six algorithms. 1
In this paper, we do not consider various features provided by different variants of the signcryption scheme (e.g., [10] and [17]).
Secret Signatures: How to Achieve Business Privacy Efficiently?
33
1. Setup : params ← SS.SetU p(1k ). A probabilistic algorithm, SS.SetU p, which takes a security parameter, k, as input and outputs the public parameters, params. 2. Key Generation : (pk, sk) ← SS.KeyGen(params). A probabilistic algorithm, SS.KeyGen, which takes the public parameters, params, as input and outputs a pair (pk, sk) of matching public and private keys. For example, the public/private key pairs of the signer and receiver, (pkS , skS ) and (pkR , skR ), are generated using SS.KeyGen. 3. Signing : (V, seed) ← SS.Sign(params, m, skS , pkR ). A probabilistic algorithm, SS.Sign, run by the signer, which takes as input the public parameters, params, a plaintext message, m ∈ {0, 1}∗, signer’s private key, skS , and receiver’s public key, pkR ; and outputs a secret signature, V and the random seed which was used to compute the signature. Signer has to keep seed secretly by himself. 4. Verification : result ← SS.V erif y(params, V, m, pkS , skR ). A deterministic algorithm, SS.V erif y, run by the receiver, which takes as input the public parameters, params, a secret signature, V , a plaintext message, m, the signer’s public key, pkS , and receiver’s private key, skR , and outputs result. If V is a valid secret signature, then result = valid, otherwise, result = invalid. If correct signature V = SS.Sign(params, m, skS , pkR ) is tested, the verification result SS.V erif y(params, V, m, pkS , skR ) 0004→ result should be valid. 5. Public Proving A probabilistic algorithm that is run by either the signer or the receiver to prove the validity of the secret signature to public. Run by the signer : proofS ← SS.P roof.Signer(params, V, seed). SS.P roof.Signer which takes as input required parameters, params, the signer’s random seed used to compute the secret signature, and the secret signature, V , and outputs a proof, proofS . Run by the receiver : proofR ← SS.P roof.Receiver(params, V, skR ). SS.P roof.Receiver which takes as input required parameters, params, the secret signature, V , and the receiver’s private key, skR , and outputs a proof, proofR . 6. Public Verification : result ← SS.P ubV erif y(params, m, V, pkS , pkR , proof). A deterministic algorithm, SS.P ubV erif y, which takes as input the public parameters, params, a message m, a secret signature, V , the public keys of the signer and the intended receiver, and the validity proof proof (either proofS or proofR ), and outputs a verification result, result (either valid or invalid ). 2.2
Security Definitions
Informally we consider the following security requirements for the proposed SS scheme described in Definition 1. Correctness. If a secret signature is generated by following the protocol correctly, then the result of the verification always return valid.
34
B. Lee et al.
Unforgeability. Anyone except the signer can have a non-negligible advantage in forging a secret signature. Non-Repudiation. The signer is unable to repudiate the generation of a secret signature that the signer has previously generated. If unforgeability is provided, then non-repudiation is obtained consequently. Signature Privacy. The secret signature generated by the signer is verifiable only by the designated receiver. No other entity except the signer and the receiver is able to have a non-negligible advantage in distinguishing the secret signature. Signature privacy is defined in terms of invisibility. Public Provability. The validity of the signature can be proven to public by the signer or the verifier, if the need arises. To define the unforgeability and non-repudiation more formally, we recall the widely accepted security notions on digital signatures, unforgeability against chosen-message attacks. In order to provide non-repudiation, we would like to prevent the forgery of A’s signature without knowledge of A’s signing key, except with negligible probability. As shown in the seminal paper of Diffie and Hellman [11], the security of such a scheme in the public key setting typically depends on the existence of a one-way function. A formalized and widely accepted security notion for digital signature was introduced by Goldwasser, Micali, and Rivest [14], which they term existential unforgeability under adaptive chosen-message attack (EF-ACMA). However, in our proposed SS scheme, there are two inputs: the message to be signed and the intended recipient’s public key. Hence, we extend the standard security definition to the existential unforgeability under the adaptive chosenmessage chosen-receiver attack (EF-ACMCRA) in which the attacker is allowed to query secret signatures to the signing oracle for any chosen message and receiver’s public key adaptively. An unforgeability of secret signature can be defined in terms of the following unforgeability game. Game Unforgeability: Let F be a forger and k be a security parameter. 1. (Initialization) First, params ← SS.SetU p(1k ) is executed and the signer’s key pair (pkS , skS ) ← SS.KeyGen(params) is computed. pkS is given to F . 2. (Training) F is allowed to ask a series of SS.Sign queries for any combination of message m and receiver’s public key, pkR , chosen by F to the signing oracle. To do this, F computes (pkR , skR ) ← SS.KeyGen(params), and asks secret signature to the signing oracle by sending (m, pkR , skR ). Then the signing oracle provides valid secret signatures V . 0004 0004 , skR , V 0004 ) as a forgery of a secret signature 3. (Output) F outputs a pair (m0004 , pkR 0004 on message m from the signer S to a receiver R0004 . 0004 F wins the game if valid ← SS.V erif y(params, V 0004 , m0004 , pkS , skR ) and the tu0004 0004 0004 0004 ple (m , pkR , skR , V ) has never been queried to SS.Sign. In this definition of unforgeability we assume that receiver’s key pair is known to F and the signing oracle. Without the knowledge of receiver’s private key the signing oracle cannot
Secret Signatures: How to Achieve Business Privacy Efficiently?
35
simulate secret signature and F cannot verify the validity of received secret signature. Since the main concern of the unforgeability game is the unforgeability of signer’s signature, this assumption is reasonable. Definition 2. (Unforgeability) A secret signature scheme is said to be secure in the sense of existential unforgeability under the adaptive chosen-message chosenreceiver attack (EF-ACMCRA), if no probabilistic, polynomial-time (PPT) forger, F , can have a non-negligible advantage in Game Unforgeability. Signature privacy requires that a given secret signature is a private information between the signer and the receiver. Any other entity cannot distinguish a secret signature from a random string in a signature space. Signature privacy can be defined in terms of the following invisibility game. Game Invisibility: Let D be a distinguisher. First, params ← SS.SetU p(1k ) is executed and the signer’s key pair (pkS , skS ) ← SS.KeyGen(params) is computed. pkS is given to D. Let (pkR , skR ) ← SS.KeyGen(params) be the receiver’s key pair. pkR is given to D. At some point D outputs a message m0004 and requests for a challenge secret signature V 0004 to the challenger C. The challenge V 0004 is generated by C based on the outcome of a hidden coin toss b. If b = 1, V 0004 is generated by running SS.Sign. If b = 0, V 0004 is chosen randomly in the signature space. At the end of the game, D outputs a guess b0004 . D wins if b = b0004 and the tuple (m0004 , V 0004 ) has never been queried to SS.Sign. In this invisibility game, receiver’s private key skR is hidden from D, since D is not the designated receiver. The designated receiver can distinguish the secret signature using his private key. Definition 3. (Invisibility) A secret signature scheme is said to provide invisibility, if no probabilistic, polynomial-time distinguisher, D, can have a nonnegligible advantage in Game Invisibility. 2.3
General Implementation
The underlying intuition behind the general implementation of the proposed SS schemes is to combine a secure signature scheme and a non-interactive one-way key agreement. We denote the signer as A and the intended receiver as B. Key agreement. Assume that A wants to send a secret signature for a message m to B. Using a non-interactive one-way key agreement scheme A generates an agreed secret (session) key, K, using B’s public key, pkB . For example, in DL-based cryptosystems, A chooses a random seed rA and computes an rA = g xB rA = (g rA )xB . agreed key K = pkB Signing. A generates a signature, V , by signing m K with A’s signing key, skA . We can interpret V as a secret signature for the message m that is privately shared between A and B, since no one else should be able to verify V without knowledge of K.
36
B. Lee et al.
Verification. B is able to compute the shared key using his private key and, hence, verify the secret signature. Any entity other than the signer and intended receiver cannot compute K, thus cannot determine the validity of the signature, even cannot tell what message was signed by whom to which receiver. Note that signatures on the same message generated by the same signer for the same receiver will differ from one session to another due to the additional session key component in the generation of the secret signature. Also the session key K provides a binding between the signature and the recipient, thus the same signature cannot be used for other recipient. We remark that we mainly focus on the signature privacy, rather than the message confidentiality. Depending on the requirement of the applications, the nature in which the actual message is exchanged can be sent in clear, sent in ciphertext, or does not need to be sent (implicitly known to the receiver).
3
DL-Based Implementation of Secret Signature Scheme
The proposed SS scheme can be implemented using different public key cryptosystems (e.g., identity-based cryptosystems). In this section, we present an implementation example of the SS scheme in the discrete logarithm-based setting. 1: Setup We assume common system parameters (p, q, g) where p and q are large primes satisfying q p − 1 and g is an element of order q in Z∗q . We then require a secure cryptographic hash function, H : {0, 1} 0004→ Zq , which we will model as a random oracle [4]. For readability, we will omit the modulo p in our subsequent equations, if it is clear. Let ∈R denote uniform random selection. 2: Key Generation A signer A has a long-term certified key pair (xA , yA ), where xA ∈R Z∗q and yA = g xA . A receiver B has a long-term certified key pair (xB , yB ), where xB ∈R Z∗q and yB = g xB . 3: Signing Let m denote the message to be signed. The signer, A, selects a random seeds rA ∈R Z∗q . Using rA , A now computes U = g rA and the key to be rA shared with the verifier, W = yB . A computes V = rA + xA H(m, U, W ). A sends the secret signature, 0006m, U, V 0007, to the intended receiver, B. 4: Verification The receiver, B, uses his private key, xB , to compute the exchanged key ? H(m,U,W ) chosen by the signer, W = U xB . B then verifies V by g V = U · yA . If V verifies correctly, then B knows that the message is indeed signed by A. 5: Public Proving The validity of secret signature 0006m, U, V 0007 is proven to public either by the signer or the receiver. In this stage we consider the following two cases according to the information revealed.
Secret Signatures: How to Achieve Business Privacy Efficiently?
Signer A
37
Receiver B
xA ∈R Z∗q , yA = g xA rA ∈R Z∗q rA U = g rA ; W = y B
2. Key Generation
xB ∈R Z∗q , yB = g xB
3. Signing
0003m, U, V 0004 V = rA + xA H(m, U, W ) −−−−−−−−−−−−−−−−→ 4. Verification
W = U xB g
V
?
H(m,U,W )
= U · yA
5. Public Proving (1) Expose W (1) Expose W = U xB (2) Prove the validity of W (2) Prove the validity of W 6. Public Verification ?
H(m,U,W )
g V = U · yA
Fig. 1. DL-based implementation
(1) Message proving. If the receiver information needs not be exposed, just reveal W . With the additional information, W , anyone can verify that (U, V ) is a correct signature of the signer A for the message m and W . If only message proving is required, it is very efficient. (2) Receiver proving. If the receiver information needs to be proven, reveal W and prove its correctness with respect to the receiver’s public key. Its validity can be proven by the signer or the receiver either non-anonymously (using the general proof) or anonymously (using the anonymous proof) which will be described in Section 4. 6: Public Verification Given W which is proven to be correct, anyone can verify the validity of the ? H(m,U,W ) . secret signature by checking g V = U · yA We prove the security of our scheme assuming the intractability of the discrete log problem and also in the random oracle model (ROM). Theorem 1. The proposed SS scheme is EF-ACMCRA secure (in the sense of Definition 2) in the random oracle model under the assumption that the discrete logarithm problem is intractable.
38
B. Lee et al.
Proof Sketch. Since the proposed SS scheme is a ElGamal family signature scheme, Forking lemma [21,20] can be applied. We assume that there exists a forger F (described in Definition 2) that can forge a secret signature in time t(k) with a non-negligible advantage 0002(k). In the training stage F can ask signing queries for any combination of message and receiver pair to the signing oracle and receive correct secret signatures from the signing oracle. The challenger C controls all communication of F and simulate the signing queries. The signing algorithm uses a hash function which is modeled by an random oracle under the random oracle model. For each signing query (M, yR , xR ) given by F , C picks random integers a, b ∈R Zq∗ and computes b , W ← U xR , h ← −b, V = a. U ← g a yA
C gives h as a random oracle answer to the H(m, U, W ) query and (U, V ) as the signature for (M, yR , xR ) signing query. Then this simulated signatures can pass F ’s signature verification and are indistinguishable from the real signatures. The remaining proof is the same as the case for the original Schnorr signature. The discrete logarithm problem can be solved in time t0004 (k) with advantage 00020004 (k) where −0.5 t0004 (k) ≈ {2(t(k) + qH τ ) + OB (qS k 3 )}/0002(k), 00020004 (k) ≈ qH where qS and qH are the numbers of signing queries and hash oracle queries, respectively, and τ is time for answering a hash query. If a successful forking is found, signer’s private key xA can be computed, which contradicts the discrete logarithm assumption.
Theorem 2. The proposed SS scheme provides signature privacy (invisibility) in the sense of Definitions 3 under the random oracle model, if the decisional Diffie-Hellman (DDH) problem is intractable. The proof for Theorem 2 generally follows that of Galbraith and Mao [13]. We assumes that there exist an adversary D, who can gain a non-negligible advantage in distinguishing the signatures in the game outlined in Definition 3. We now construct another algorithm, DDDH , to break the decisional Diffie–Hellman (DDH) problem using D.
4
Proving the Validity of Secret Signature
In the public proving stage, the signer or the receiver prove the validity of secret signature to a judge or public. Once the proof is given, anyone can verify the validity of secret signature and non-repudiation is provided. Here we consider the following two cases. General Proof. In this protocol, the identity of the entity (signer or receiver) who proves the validity of the SS is revealed, since the signer’s proof and the receiver’s proof are distinguishable.
Secret Signatures: How to Achieve Business Privacy Efficiently?
39
Anonymous Proof. As the name suggests, the identity of the entity who proves the validity of the SS is not revealed, since the signer’s proof and the receiver’s proof are indistinguishable. However, this proof is computationally more expensive than that of the general proof. 4.1
General Proof Protocol
In this protocol, either the signer, A, or the receiver, B, reveals the shared key W and proves its validity using the proof outlined in Appendix A. – The signer A proves ZKP (rA ) : (logg U = logyB W = rA ) using his knowledge of rA . – The receiver B proves ZKP (xB ) : (logg yB = logU W = xB ) using his knowledge of xB . It is easy to see that the proofs initiated by the signer and the receiver are distinguishable, hence the identity of the entity who had exposed the secret signature is revealed. 4.2
Anonymous Proof Protocol
There might exist situations where we are unable to reveal the identity of the entity who exposed the secret signature, perhaps, due to privacy or legal restrictions. In such cases, we cannot employ the general proof protocol presented above. Here we show how the signer or the receiver can prove the validity of secret signature anonymously without revealing their identity. Note that the tuple 0006g, U, yB , W 0007 has the special relations depicted in Figure 2. The signer knows rA (= logg U = logyB W ) and the receiver knows xB (= logg yB = logU W ). The anonymous proof protocol can be initiated either by A or by B. They exrA = U xB = g rA xB and demonstrate their knowledge pose the shared key W = yB of the corresponding secret information as follows. ZKP (rA ∨ xB ) : (logg U = logyB W = rA ) ∨ (logg yB = logU W = xB ). It is a OR combination of two zero-knowledge proofs of the equality of two discrete logarithms described in Appendix B. A or B is able to prove the validity of W by using their knowledge of rA or xB .
g Receiver knows xB
⏐ ⏐ 0003
Signer knows rA − −−−−−−−−−−−−−−−→
U = g rA ⏐ ⏐ 0003
x x r yB = g xB − −−−−−−−−−−−−−−−→ W = U B = g B A
Fig. 2. Special relations of the tuple 0003g, U, yB , W 0004
40
B. Lee et al.
Although these two proofs by the signer and the receiver are computed differently, any public verifier is unable to distinguish whether the proof is provided by the signer or the receiver. If one of the party opens the secret signature anonymously, the other partner know that no one other than the partnering entity has opened the secret signature. However, the party is unable to prove that the other partnering entity has opened the secret signature. From the public’s perspective, the identity remains anonymous.
5
Comparison of Features
Several signature variants found in the literature also provide signature-privacyrelated functionalities. We now compare these schemes with our proposed SS scheme. Sign-then-encrypt approach: Although this approach provides signature privacy property, it has the following disadvantage. Once the receiver decrypt the encrypted signature and obtain the corresponding publicly verifiable signature, the message is no longer linkable to the receiver. Thus the receiver or any third party can use it for malicious purposes. On the other hand, this is not the case in our SS scheme as the (special) signature generated by a signer is given to a specific receiver. Undeniable signature [8] and designated confirmer signature [5]: In the former scheme, the recipient has to interact with the signer to be convinced of its validity whilst in the latter scheme, the signatures has to be verified by interacting with an entity, the confirmer, designated by the signer. Both signature schemes require an interactive protocol to carry out signature verification. In our proposed SS scheme only a simple and computationally cheap algorithm is required to carry out the signature verification. Nominative signature [16] scheme: This scheme allows a nominator (signer) and a nominee (verifier) to jointly generate and publish the signature in such a way that only the nominee can verify the signature and if necessary, only the nominee can prove to a third party that the signature is valid. Although signature privacy can be achieved using this scheme, it requires an interactive protocol for the signing stage. In our proposed SS scheme, a signer is able to generate the signature on his/her own. Designated Verifier Signature (DVS) scheme: This scheme, independently proposed by Jakobsson, Sako, and Impagliazzo [15]2 and Chaum [6] in 1996, provides signature privacy. Although the designated verifier can be convinced of the validity of the signature in the DVS scheme, the verifier is unable to transfer the conviction to any other entity. A major difference between our proposed SS scheme and the DVS scheme is that the latter is unable to provide public provability of the signature. Limited Verifier Signature (LVS) scheme: This scheme, first proposed by Araki, Uehara, and Imamura in 1999 [1], differs from the DVS scheme in that 2
Lipmaa, Wang, and Bao [19] pointed out a weakness in this DVS scheme [15].
Secret Signatures: How to Achieve Business Privacy Efficiently?
41
the limited verifier is able to transfer the proof to convince another entity (e.g., a judge) if the signer has violated some rule non-cryptographically. Such a proof is, however, not transferrable to a third entity. In 2004, Chen et. al. proposed a convertible limited verifier signature scheme in a pairing-based setting [9]. Their scheme allows the limited verifier signature to be transformed into a publicly verifiable signature. This converted limited verifier signature, however, is rather loosely related to the limited verifier any more since the limited verifier is unable to prove that he is the intended recipient of the converted signature. On the other hand in our proposed SS scheme, any receiver can prove publicly that he is the legitimate receiver of the corresponding signature. Anonymous signature scheme: First proposed by Yang et al. [25], this scheme appears to have similar property. However, the anonymous signature scheme provides signer anonymity and does not have an intended receiver when the signature is generated. We will provide an example in Section 7 to better explain this difference. Signcryption scheme [26]: This scheme, first proposed by Zheng in 1997, is perhaps most similar to our proposed SS scheme. The signcryption scheme is built from a clever combination of a secure encryption scheme and a secure signature scheme, providing both confidentiality and non-repudiation. Many extensions of the signcryption scheme have also been proposed (e.g., [2,3,18,22,23]3 ). Signcryption provides signature privacy by encrypting the message. However, this is computationally expensive particularly for applications that do not require message confidentiality. SS is a new approach to provide signature privacy without encryption. At first read, our proposed SS scheme might be confused with other previously published signature privacy-related signature schemes. However, if we refer to the definition of SS scheme given in Definition 1, it is clear that: – The undeniable signature scheme differs from the SS scheme since interactive protocol between the signer and the verifier is required in the signature verification algorithm. – DVS and LVS schemes differ from the SS scheme since public provability cannot be achieved. – Convertible LVS scheme differs from the SS scheme since the converted signature is not related with the receiver in any way. – The anonymous signature scheme differs from the SS scheme since the signature does not have an intended receiver when the signature is generated whilst in our proposed SS scheme, an intended receiver is required at the time of signature generation. On the other hand, the signcryption scheme is most similar to our proposed SS scheme if the public proving/verification algorithms are further defined. 3
The signcryption scheme of Libert and Quisquater [18] is shown to be insecure [24].
42
6
B. Lee et al.
Comparison of Efficiency
Since the signature-privacy-related signature schemes presented in Section 5 – with the exception of the signcryption scheme – have rather different functionalities, we will restrict our comparison only to the signcryption scheme. Since public proving/verification protocols were not defined in the original signcryption scheme, we assume that similar zero-knowledge proof techniques are applied in order to facilitate our comparison. Note that both the general and anonymous proofs are also possible in the signcryption scheme. For completeness, we now describe briefly how the general and anonymous proofs are possible in Zheng’s signcryption scheme. To prove the correctness of signcryption, either the signer or the receiver has to compute and reveal the following information (we use the same notation as the original paper). – The signer, A, has to keep the random number, x, used in the signcryption secret. A has to compute and reveal ybx and g x (requires 1 extra E). In this ?
case, the public verifier has to check whether g x = g rs yas holds (requires 2E). – The receiver, B, has to compute and reveal g x = g rs yas and ybx = (g x )xb (requires 3E). Now, using the general proof protocol, the signer can then prove ZKP (x) : (logg g x = logyb ybx = x) and the receiver can prove ZKP (xb ) : (logg yb = loggx ybx = xb ), which requires 2E for proof and 4E for verification. Anonymous proof is also possible for the 0006g, g x , yb , ybx 0007 tuple, which requires 6E for proof and 8E for verification. Some of the computations required by the signer in our scheme can be performed offline (i.e., before the message to be sent and the receiver are known), such as U = g rA , and hence, provides efficiency. In Table 1, we compare the efficiency of our scheme with Zheng’s signcryption scheme. The notation used in Table 1 is as follows: E and Eof f denotes online and offline modular exponentiations, respectively; and ED denotes the cost for symmetric encryption or Table 1. Summary of efficiency analysis
Signing Verification General proof Proof by signer Verification General proof Proof by receiver Verification Anonymous proof Proof by signer Verification Anonymous proof Proof by receiver Verification Public verification
Signcryption [26] 1E + 1ED 2E + 1ED 3E 6E 5E 4E 7E 10E 9E 8E 1ED
Proposed SS scheme 1E + 1Eof f 3E 2E 4E 2E 4E 6E 8E 6E 8E 2E
Secret Signatures: How to Achieve Business Privacy Efficiently?
43
decryption. For simplicity, we ignore modulo multiplication/division operations and hash operations in our comparison. Compared with Zheng’s signcryption, our proposed SS scheme is more efficient both in the actual signature scheme and the public proving. Since the SS scheme does not use symmetric encryption, it is more efficient than signcryption, especially with long message when message confidentiality is not required. In public proving, the same zero-knowledge proofs can be used. In signcryption, however, both the signer and receiver have to compute and reveal additional information. All required information is already included in the generated signature in the SS scheme. Therefore, the SS scheme is more efficient than signcryption when used in business transactions that do not require confidentiality of messages.
7
Applications of Secret Signatures
The proposed SS scheme can be used as an important cryptographic primitive to achieve business privacy, providing both signature privacy and public provability of signature efficiently. Secret signature is useful in many applications where (1) the privacy of generated signature needs to be maintained, but the authorship of the signature can be publicly proven at a later stage, (2) message confidentiality is not very important, and (3) efficiency is critical. We now describe some possible application examples. Application 1: Private Business Transaction Let’s assume that two business entities, A and B, wanting to exchange some not-so-confidential contract document, m (e.g., m can be constructed using open source information). Although the contents of m do not need to be confidential, both A and B do not want to reveal to other entities that they had signed m. By using the proposed SS scheme, both A and B are assured that no third party is aware that m has been signed. In the event that one of the entities violates a non-cryptographic business rule, the other entity can prove the validity of their private contract to a third party by opening the generated secret signature. The public provability property guarantees the fairness of private business. Application 2: Public Auction We consider an application is a public auction (or English auction) setting where bidding prices are published and the bidders are allowed to bid prices anonymously as frequently as desired. When the winner is finally decided, the anonymity of the winning bid is revealed and the correctness of the winning bid should be proven publicly (to provide public verifiability). This is a typical example where message confidentiality is not required, but signature privacy and public provability are required. To provide the anonymity of bid with public provability, we can use the proposed SS scheme. In the bidding stage, bidders bid their prices using a secret
44
B. Lee et al.
signature with the auctioneer as a receiver. For example, let A be the auctioneer, Bi be bidders, and pj be the bidding prices. Bidder Bi computes si,j = SS.Sign(params, pj , skBi , pkA ), ki,j = EpkA (Bi ), and posts 0006pj , si,j , ki,j 0007 on the bulletin board, where ki,j is an encrypted ID information of the bidder (can be decrypted only by the auctioneer). In the winner announcement stage, the auctioneer opens the highest price bid and proves the correctness of secret signature. Any misbehavior of the auctioneer or bidders can be proven publicly. Note that bidder anonymity was achieved easily without using any encryption. Also note that threshold cryptography can be used by the auctioneer such that pkA is distributed to multiple auctioneers and bidder information of the losing bids is kept secret. Application 3: Paper Submission System Consider a paper submission system for a conference. In this case submitted papers should be anonymous while they need not be encrypted. Authors need to commit the following facts to the program chair with a signature; (1) the submitted paper is their authentic work, (2) the submitted paper is not submitted in parallel to another conference or journal, (3) the authors will present the paper if accepted, and etc. Upon receiving the submission, the program chair has to issue a receipt for the submitted paper to the author. In such an application, the authors and the conference program chair can exchange secret signatures with the other entity as a receiver (or publish the secret signature on the bulletin board as a public commitment). If general signatures are exchanged, anonymity will be broken (since anyone can verify the authorship of the submitted paper using the respective public keys). In the unlikely event of a dispute between the program chair and an author at a later stage, it can be resolved easily by using the public proving feature of secret signature. One may note that the generated secret signature is tightly bound to the recipient, the program chair of the conference. Hence, the same signature cannot be submitted to another recipient, the program chair of another conference. Therefore, if it was subsequently discovered that the same paper with two different signatures was submitted to two different conferences in parallel, the author cannot repudiate his misbehavior. However, this is not the case for Yang et al.’s [25] anonymous signature scheme; the generated signature is not bound to any recipient. The signer is able to repudiate that someone other than the signer has forwarded the signature and the paper to the program chair of another conference without the signer’s knowledge.
8
Conclusion
We had discussed the signature privacy issue and introduced a new signature variant, which we termed secret signature (SS). The SS scheme provides signature privacy and public provability of signature in an efficient manner by
Secret Signatures: How to Achieve Business Privacy Efficiently?
45
combining secure signature schemes and non-interactive one-way key agreement schemes. Although this is a very simple concept, it is a useful and efficient cryptographic tool to achieve business privacy. We then presented a concrete implementation example of secret signature in discrete logarithm-based cryptosystems. Future extension of this work includes implementing the SS scheme in other cryptosystems such as RSA-based and pairing-based cryptosystems.
Acknowledgement The second author would like to thank Sherman SM Chow for pointing out references [10] and [17].
References 1. Araki, S., Uehara, S., Imamura, K.: The Limited Verifier Signature and its Applications. IEICE Transactions E82-A(1), 63–68 (1999) 2. Baek, J., Steinfeld, R., Zheng, Y.: One-time Verifier-based Encrypted Key Exchange. In: Naccache, D., Paillier, P. (eds.) PKC 2002. LNCS, vol. 2274, pp. 80–98. Springer, Heidelberg (2002) 3. Bao, F., Deng, R.H.: A Signcryption Scheme with Signature Directly Verifiable by Public Key. In: Imai, H., Zheng, Y. (eds.) PKC 1998. LNCS, vol. 1431, pp. 55–59. Springer, Heidelberg (1998) 4. Bellare, M., Rogaway, P.: Random Oracles Are Practical: A Paradigm For Designing Efficient Protocols. In: ACM CCS 1993, pp. 62–73. ACM Press, New York (1993) 5. Chaum, D.: Undeniable Signatures. In: De Santis, A. (ed.) EUROCRYPT 1994. LNCS, vol. 950, pp. 86–91. Springer, Heidelberg (1995) 6. Chaum, D.: Private Signature and Proof Systems. United States Patent 5,493,614 (1996) 7. Chaum, D., Pedersen, T.P.: Wallet Databases with Observers. In: Brickell, E.F. (ed.) CRYPTO 1992. LNCS, vol. 740, pp. 89–105. Springer, Heidelberg (1993) 8. Chaum, D., van Antwerpen, H.: Undeniable Signatures. In: Brassard, G. (ed.) CRYPTO 1989. LNCS, vol. 435, pp. 212–216. Springer, Heidelberg (1990) 9. Chen, X., Zhang, F., Kim, K.: Limited Verifier Signature Scheme from Bilinear Pairings. In: Jakobsson, M., Yung, M., Zhou, J. (eds.) ACNS 2004. LNCS, vol. 3089, pp. 135–148. Springer, Heidelberg (2004) 10. Chow, S.S.M., Yiu, S.M., Hui, L.C.K., Chow, K.P.: Efficient Forward and Provably Secure ID-Based Signcryption Scheme with Public Verifiability and Public Ciphertext Authenticity. In: Lim, J.-I., Lee, D.-H. (eds.) ICISC 2003. LNCS, vol. 2971, pp. 352–369. Springer, Heidelberg (2004) 11. Diffie, W., Hellman, M.: Multiuser Cryptographic Techniques. In: AFIPS 1976 National Computer Conference, pp. 109–112. AFIPS Press (1976) 12. Fiat, A., Shamir, A.: How to Prove Yourself: Practical Solutions to Identification and Signature Problems. In: Odlyzko, A.M. (ed.) CRYPTO 1986. LNCS, vol. 263, pp. 186–194. Springer, Heidelberg (1987) 13. Galbraith, S.D., Mao, W.: Invisibility and Anonymity of Undeniable and Confirmer Signatures. In: Joye, M. (ed.) CT-RSA 2003. LNCS, vol. 2612, pp. 80–97. Springer, Heidelberg (2003)
46
B. Lee et al.
14. Goldwasser, S., Micali, S., Rivest, R.L.: A Digital Signature Scheme Secure Against Adaptive Chosen-Message Attacks. SIAM Journal on Computing 17(2), 281–308 (1988) 15. Jakobsson, M., Sako, K., Impagliazzo, R.: Designated Verifier Proofs and Their Applications. In: Maurer, U.M. (ed.) EUROCRYPT 1996. LNCS, vol. 1070, pp. 321–331. Springer, Heidelberg (1996) 16. Kim, S.J., Park, S.J., Won, D.H.: Zero-Knowledge Nominative Signatures, pp. 380– 392 (1996) 17. Li, C.K., Yang, G., Wong, D.S., Deng, X., Chow, S.S.M.: An Efficient Signcryption Scheme with Key Privacy. In: EuroPKI 2007. LNCS, vol. 4582, pp. 78–93. Springer, Heidelberg (2007) 18. Libert, B., Quisquater, J.-J.: Efficient Signcryption with Key Privacy from Gap Diffie-Hellman Groups. In: Bao, F., Deng, R., Zhou, J. (eds.) PKC 2004. LNCS, vol. 2947, pp. 187–200. Springer, Heidelberg (2004) 19. Lipmaa, H., Wang, G., Bao, F.: Designated Verifier Signature Schemes- Attacks, New Security Notions and a New Construction. In: Welzl, E., Montanari, U., Rolim, J.D.P. (eds.) ICALP 2000. LNCS, vol. 1853, pp. 459–471. Springer, Heidelberg (2000) 20. Mao, W.: Modern Cryptography: Theory and Practice. Prentice-Hall, Englewood Cliffs (2003) 21. Pointcheval, D., Stern, J.: Security Arguments for Digital Signatures and Blind Signatures. Journal of Cryptology 13, 361–396 (2000) 22. Shin, J.-B., Lee, K., Shim, K.: New DSA-Verifiable Signcryption Schemes. In: Deng, R.H., Qing, S., Bao, F., Zhou, J. (eds.) ICICS 2002. LNCS, vol. 2513, pp. 35–47. Springer, Heidelberg (2002) 23. Steinfeld, R., Zheng, Y.: A Signcryption Scheme Based on Integer Factorization. In: Okamoto, E., Pieprzyk, J.P., Seberry, J. (eds.) ISW 2000. LNCS, vol. 1975, pp. 308–322. Springer, Heidelberg (2000) 24. Yang, G., Wong, D.S., Deng, X.: Analysis and Improvement of a Signcryption Scheme with Key Privacy. In: Zhou, J., Lopez, J., Deng, R.H., Bao, F. (eds.) ISC 2005. LNCS, vol. 3650, pp. 218–232. Springer, Heidelberg (2005) 25. Yang, G., Wong, D.S., Deng, X., Wang, H.: Anonymous Signature Schemes. In: Yung, M., Dodis, Y., Kiayias, A., Malkin, T.G. (eds.) PKC 2006. LNCS, vol. 3958, Springer, Heidelberg (2006) 26. Zheng, Y.: Digital Signcryption or How to Achieve Cost (Signature & Encryption) << Cost (Signature) + Cost (Encryption). In: Kaliski Jr., B.S. (ed.) CRYPTO 1997. LNCS, vol. 1294, pp. 1165–1793. Springer, Heidelberg (1997)
A
Proving the Equality of Two Discrete Logarithms
Let α and β be two independent generators of order q in modular p. A prover P tries to prove to a verifier V that two numbers a = αx and b = β x have the same exponent without exposing x. We denote this proof as ZKP (x) : (logα a = logβ b = x). Based on the scheme by Chaum and Pedersen [7] and Fiat-Shamir’s heuristics [12] the non-interactive proof can be done as follows.
Secret Signatures: How to Achieve Business Privacy Efficiently?
47
ZKP (x) : (logα a = logβ b = x) – Proof: Prover P randomly chooses t from Z∗q and computes c = αt and d = β t . He computes h = H(α, β, a, b, c, d) and s = t + hx, then sends (c, d, s) to the verifier. Proof requires two exponentiation operations. – Verification: Verifier V first computes h = H(α, β, a, b, c, d). Then he checks ?
?
αs = cah and β s = dbh . Verification requires four exponentiation operations.
B
OR Proving the Equality of Two Discrete Logarithms
Let α1 , β1 , α2 , β2 be four independent generators of order q in modular p. A prover P tries to prove to a verifier V that either logα1 a1 = logβ1 b1 (= x1 ) or logα2 a2 = logβ2 b2 (= x2 ) holds using his knowledge of x1 or x2 without exposing it. It is an OR combination of two proofs for the equality of two discrete logarithms. We denote this proof as ZKP (x1 ∨ x2 ) : (logα1 a1 = logβ1 b1 = x1 ) ∨ (logα2 a2 = logβ2 b2 = x2 ). The prover knows either x1 or x2 , but does not know them all. This proof can be done as follows. ZKP (x1 ∨ x2 ) : (logα1 a1 = logβ1 b1 = x1 ) ∨ (logα2 a2 = logβ2 b2 = x2 ) – Proof: Assume that the prover P knows xb and does not know xb0002 . • Randomly chooses rb , sb0002 , tb0002 from Z∗q . s t s t • Computes cb = αrbb , db = βbrb , cb0002 = αb0002b0002 abb0002 0002 , db0002 = βb0002b0002 bbb0002 0002 (incurring six exponentiation operations). • Computes t = H(α1 , β1 , α2 , β2 , a1 , b1 , a2 , b2 , c1 , d1 , c2 , d2 ). • Computes tb = t − tb0002 and sb = rb − tb xb , then sends (c1 , d1 , c2 , d2 , s1 , t1 , s2 , t2 ). – Verification: Verifier V checks ? ? ? ? • c1 = αs11 at11 , d1 = β1s1 bt11 , c2 = αs22 at22 , d2 = β2s2 bt22 and ?
• t1 + t2 = H(α1 , β1 , α2 , β2 , a1 , b1 , a2 , b2 , c1 , d1 , c2 , d2 ).
Implementation of BioAPI Conformance Test Suite Using BSP Testing Model Jihyeon Jang1 , Stephen J. Elliott2 , and Hakil Kim1 1
School of Information & Communication Engineering, Inha University [email protected], [email protected] 2 Department of Industrial Technology, Purdue University [email protected]
Abstract. The purpose of this paper is to design a Conformance Test Suite(CTS) for BSPs(Biometric Service Provider) based upon the BioAPI (Biometric Application Programming Interface) v2.0, an international standard by ISO/IEC JTC1/SC37. The proposed BioAPI CTS enables users to test BSPs without depending on various frameworks. In this paper, a test scheduling tool has been embodied in order to use Test Assertion with XML. In order to demonstrate the performance of the CTS, the experiment was performed using both commercial fingerprint verification and identification BSPs. The developed CTS will be installed at Korean National Biometrics Test Center and used to test whether commercial biometrics products are compliant to BioAPI. Keywords: BioAPI, Conformance Test Suite, Biometric Service Provider.
1
Introduction
Biometrics recognition technology is rapidly developing with numerous biometrics products entering the marketplace, and are becoming pervasive in some areas. Each vendor develops biometrics products in different technical schemes, and the market suffers from the problem of ‘One-Vendor Solution,’ which means that end user’s system becomes dependent on a particular vendor’s solution and nearly impossible to upgrade by other vendor’s solution. This problem is caused by the lack of standards in biometrics technology. Numerous studies have been conducted to overcome the problem of ‘One-Vendor Solution’, thus, the need for a standardized interface of biometrics products has increased. Since the establishment of ISO/IEC JTC1/SC37 in 2002, the standardization of biometrics technology has been actively progressed. One of the most important standards for biometrics is ISO 19784-1, BioAPI Specification (Biometric Application Programming Interface - Part 1: Specification). This standard defines the Application Programming Interface and Service Provider Interface (SPI) for standard interfaces within a biometric system consisting of components from multiple vendors. Commercial products are required to be compliant to this standard, hence, need to be tested for conformance. S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 48–60, 2007. c Springer-Verlag Berlin Heidelberg 2007 0002
Implementation of BioAPI Conformance Test Suite
49
The Conformance Test Suite (CTS) is a means of identifying whether a product satisfies a corresponding standard. Once certified by the CTS, the product is deemed interoperable with standard-compliant devices manufactured by other vendors. The previous versions of BioAPI CTS initially developed by Korea Information Security Agency (KISA) in 2003 [1-3] and by NIST in 2006[4], were based on BioAPI v1.1 proposed by the Biometrics Consortium [5]. These versions indirectly tested whether BSPs (Biometric Service Provider) were compliant to the standard, using the consortium’s BioAPI framework which is a middleware infrastructure between BioAPI compliant applications and BSPs. The problem with these previous versions is that when an error occurs, the CTS cannot find whether it originated from the BSP under test or from the framework. In order to resolve the problem stated above, this study develops a new BioAPI CTS in the following technical aspects. Firstly, the new CTS which is based on BioAPI v2.0 standards [6] is able to test BSPs without the BioAPI framework. Secondly, it can load various test procedures arbitrarily designed depending on the purpose of the test. Thirdly, the new CTS report the test results in XML format. Finally, it is implemented to test each BSP differently, depending on the purpose of the BSP. The following section introduces the conformance test methods and models. The third section describes the design and implementation of the proposed CTS. The forth section concludes this paper and discusses further works.
2
Conformance Test Suite for BioAPI: Methods and Models [7]
Conformance test is defined as an evaluation of whether a developed product meets standard requirements. That is, it must provide a message to inform the vendor that the usage form of a product is incorrect or it is omitted items of standards specifications. Accordingly, the CTS must be composed so that it can evaluate whether a product appropriately applies the standard. 2.1
Conformance Testing Methods
As presented in Fig. 1, the conformance test of BioAPI is divided into a Function Test, to evaluate BioAPI functions, and a Scenario Test, to evaluate whether the BSP functions perform correctly. The function test evaluates whether mandatory and optional functions conform to the standard. If a function subject to be tested is called, function mapping proceeds in the framework and the BSP function is implemented through the relevant SPI (Service Provider Interface) function. The return values on each function implemented in the BSP are delivered to an application, and the result is output through test log after the performance process is finished. All BSPs support basic component management, handle, event and utility operations. The callback, database and BioAPI unit operations are optional. Second, the scenario test is a method of evaluating whether the BSP is capable of calling successive functions prescribed by the standard. The processes of the three types of scenario test are defined as the following.
50
J. Jang, S.J. Elliott, and H. Kim
Fig. 1. Test method structure for conformance test
Handling Scenario When a BSP creates a new BIR (Biometric Information Record), it returns a “handle” to it. The majority of local operations can be performed without moving the BIR out of the BSP. However, if the application is required to manage the BIR, it can acquire the BIR using the handle. In order to test the BIR creation function, it is mandatory to run BioAPI Capture prior to running other BIR functions because it needs the biometric image. After the image is captured using BioAPI Capture, the handling process is carried out by the BIR handle function.
Fig. 2. Test scenario for BIR handling
Verification Scenario he verification process matches the biometric template with an input image, where it compares the similarity. For verification, first the BSP or application should capture, and then determine the match/non-match with a stored template.
Implementation of BioAPI Conformance Test Suite
51
The Enrollment process should be tested before the Verification process. The verification process addresses the testing of whether BioAPI Verify and BioAPI Identify are called and BioAPI Enroll is called. Using BioAPI Caputer, an image is captured, the image process is carried out by the BioAPI Process. Identification work is achieved with BioAPI IdentifyMatching. This step is descried in Fig. 3.
Fig. 3. Test scenario for verification process
Fig. 4. Test scenario for identification process
52
J. Jang, S.J. Elliott, and H. Kim
Identification Scenario The identification process matches the entire stored template with input image, where it determines the best match. To verify the identification function, the following functions are required. For identification, first the BSP or application should capture, and then determine the match/non-match. The Enrollment process is tested before the Identification process. The Identification process addresses the testing of whether BioAPI Verify and BioAPI Identify are called after BioAPI Enroll is called. After the image is captured using BioAPI Capture, the image process is carried out with BioAPI Process. Identification work is conducted by BioAPI IdentifyMatching. 2.2
Conformance Testing Models
The conformance testing model is based on a variation of the basic BioAPI architecture. The basic BioAPI architecture is described as being comprised of a normal BioAPI application, BioAPI framework, and one or more normal BioAPI BSPs. The conformance testing methodology specified in the BioAPI specification, addresses each of the standard components of the BioAPI architecture separately. Three conformance testing models are defined the methodology, for the testing of each standard component.
Fig. 5. Conformance testing model for BioAPI application
Implementation of BioAPI Conformance Test Suite
53
Applications Testing Component Model As presented in Fig. 5, in the conformance testing model for BioAPI applications, the application-testing framework is inserted between the application under test and a normal framework. This testing component shall implement the BioAPI standard interface on one side and the application callback interface on the other side. As a result, it shall appear to the application as a framework, and to a framework as an application. This testing component shall have the ability to relay application calls to the normal framework. The framework calls the application during testing, and the ability to observe, analyze, log the flow of incoming calls, and generate extra calls are tested. Framework Testing Component Model Fig. 6 presents the conformance testing model for BioAPI frameworks, the framework-testing application replaces the normal application, and frameworktesting BSP shall replace the normal BSP. These two testing components lie in the framework for testing. The framework-testing application shall implement the application callback interface, and the framework-testing BSP shall implement the BioSPI interface. Therefore, the framework being tested cannot distinguish between these interfaces and the corresponding components. In addition, each testing component contains a special testing interface, enabling them to interact with each other for the purpose of performing tests.
Fig. 6. Conformance testing model for BioAPI frameworks
54
J. Jang, S.J. Elliott, and H. Kim
Fig. 7. Conformance testing model for BioAPI BSPs
BSP Testing Component Model As presented in Fig. 7, in the conformance testing model for BioAPI BSPs, the BSP-testing application shall replace the normal application and the normal framework. This testing component shall act both as a BioAPI application and BioAPI framework, and shall implement the framework callback interface. As a result, it appears as a framework to the BSP being tested. The testing component shall be able to make calls to the BioSPI interface of the BSP under testing.
3
Implementation of CTS for BioAPI BSP
The previous CTS indirectly tested whether BSPs were compliant to the standard, using the Biometrics Consortium’s BioAPI framework. However, when the CTS discovers an error, it does not distinguish whether they originate from the BSP under test or the BioAPI framework. Therefore, the new version of the CTS developed in this paper implements the BSP testing component model as shown in Fig. 7. The testing mechanism of the new CTS based on BioAPI v2.0 is illustrated in Fig. 8. In order to resolve the above problem, the following four functions are newly implemented in the comparison with the previous versions of CTS. Firstly, the
Fig. 8. Mechanism of the BioAPI v2.0 CTS
Implementation of BioAPI Conformance Test Suite
55
Table 1. Summary of BioAPI conformant BSPs
Fig. 9. Testing scheduling of BioAPI v2.0 CTS
new CTS is capable of testing a BSP which does not abide by the framework on the basis of the BioAPI v2.0 standards.Secondly, it can load arbitrarily designed test procedures by the test scheduler. The test scheduler will be replaced by a test assertion loader in the next version of the BioAPI CTS when ISO 24709-2, BioAPI Conformance Testing - Part 2: Test Assertions for BSPs [8] becomes an
56
J. Jang, S.J. Elliott, and H. Kim
international standard. Fig. 9 depicts the implementation of the test scheduler. Thirdly, the new CTS produces the test results in XML format which is the standard format for both the input and the output to the BioAPI CTS in ISO 24709-2. Finally, it is implemented to test each BSP differently, depending on the purpose of the BSP, for example, verification or identification. The following Table 1 is a summary of BioAPI conformant BSPs.
4
Experimental Results
In order to evaluate the operation of implemented BioAPI CTS, this paper tested BioAPI v2.0 with three commercial fingerprint recognition BSPs. Two tests of the BSPs were modules for verification [9, 10], and the others were for identification [11]. The BioAPI function and the associated BSPs are shown in Table 2. Fig. 10 presents the process of performing the BSP conformance test for fingerprint recognition. When the test is completed, the result is found in an XML
Table 2. Conformance of test BSPs
Implementation of BioAPI Conformance Test Suite
57
Fig. 10. Process of performing the BSP conformance test
test result, as shown in Table 3. If a BioAPI function exists in the BSP, the message of BioAPI OK is returned, while if it does not exist or is not performed normally, an error code is returned [6]. The result of BSP conformance test is provided in XML format, as shown in Table 3. However, BSPs tested in this paper only provided functions necessary for verification and identification, so it could not be confirmed whether the implemented CTS would operate properly with a BSP which provides all the functions defined in BioAPI v2.0. Thus, a sample BSP was developed, and all the functions of BioAPI were tested to simulate a full CTS experiment. Table 4 contains the results of two experiments, firstly, over three commercial BSPs for fingerprint recognition, and secondly, a test BSP developed in this study in order to test all the BioAPI functions. These experiments demonstrate that the BioAPI CTS performs properly over the BSPs supplied by manufactures. At the same time, it is recognized that the BSPs under test are compliant to BioAPI v2.0 standards.
58
J. Jang, S.J. Elliott, and H. Kim Table 3. Test Log of sample BSP
Implementation of BioAPI Conformance Test Suite Table 4. Test result of sample BSPs
59
60
5
J. Jang, S.J. Elliott, and H. Kim
Conclusions and Future Works
This paper designs and implements a conformance testing suite for BioAPI v2.0. The improved CTS can operate without the BioAPI frameworks. The weakness of the previous versions of BioAPI CTS was that they could not tell whether any test failure originates either from the BSP under test or from the BioAPI framework. The new BioAPI CTS is designed to be able to test BSPs and frameworks independently. The performance of the new BioAPI CTS was demonstrated by testing commercial fingerprint verification and identification BSPs. The developed BioAPI CTS will make the process of manufacturing standard biometrics products with less expense in shorter period of time. In addition, the biometrics products compliant to international standards will promote the interoperability among different vendor’s products and consequently enlarge the biometrics market worldwide. The current BioAPI CTS uses XML only for outputting the test results. For further progress, XML will be used for writing scripts of test scenario and a parser of XML test-scripts needs to be developed in order to make the conformancetesting process more flexible. Acknowledgments. This work was supported in part by the Korea Small and Medium Business Administration.
References 1. Korea Information Security Agency, Development of test technology for Korea BioAPI Standard (2002) 2. Korea Information Security Agency, A Study on Improvement of Conformance Test Suite Tool for Application Interface of Biometric System (2003) 3. Lee, Y.-Y., Kwon, Y.-B.: A Study on Conformance Testing Methods to Verify the BioAPI Based System Module. Korea Information Processing Society B 11(7), 759–768 (2004) 4. http://www.itl.nist.gov/div893/biometrics/BioAPI CTS/index.htm 5. Biometric Consortium, BioAPI Specification Version 1.1 (2001) 6. ISO/IEC IS 19784-1, Information Technology - Biometric application programming interface - Part 1: BioAPI specification 7. ISO/IEC FDIS 24709-1, Information Technology - Conformance Testing for BioAPI - Part 1: Methods and Procedures 8. ISO/IEC FDIS 24709-2, Information Technology - Conformance Testing for BioAPI - Part 2: Test Assertions for Biometric Service Providers 9. SecuTronix, http://www.secutronix.com/ 10. Suprema, http://www.supremainc.com/ 11. Digent, http://www.digent.co.kr/
Information Hiding in Software with Mixed Boolean-Arithmetic Transforms Yongxin Zhou, Alec Main, Yuan X. Gu, and Harold Johnson Cloakware Inc., USA {yongxin.zhou,alec.main,yuan.gu,harold.johnson}@cloakware.com
Abstract. As increasingly powerful software analysis and attack tools arise, we need increasingly potent software protections. We generate an unlimited supply of obscuring transforms via mixed-mode computation over Boolean-arithmetic (mba) algebras corresponding to real-world functions and data. Such transforms resist reverse engineering with existing advanced tools and create np-hard problems for the attacker. We discuss broad uses and concrete applications to aacs key hiding and software watermarking.
1
Introduction
With the increasing power of software analysis and attack tools and the ubiquity of open operating systems, ever stronger software data- and algorithm-hiding mechanisms are essential. (For existing protections, see, e.g., [3,4,5,6,7,10,18,21].) We introduce Boolean-arithmetic (ba) algebras which model real-world software computation. Modern alus use (1) 2’s complement arithmetic on n-bit words, which maps to the modular ring Z/(2n ), and (2) bitwise operations over Bn where B = {0, 1}. Combining (1) and (2) gives the ba-algebra BA[n], over which we define highly simplification-resistant mixed Boolean-arithmetic (mba) obfuscating transforms with np-hard fragment recognition, for which we provide unlimited-volume generators, ensuring an ever-growing mba protections database. mba transforms based on mba expressions, mba identities and invertible functions can be used to transform software code: functionality is preserved, but secret constants, intermediate values, and algorithms are hidden from static and dynamic reverse engineering and analysis. Perimeter software protection methods, that ultimately allowed the data or algorithm to appear in memory cannot provide a similar level of protection. Moreover, transformed software simultaneously occupies multiple mathematical domains, creating worst-case np-hard problems for the attacker, and making them highly resistant even to advanced analytical tools such as MapleTM or MathematicaTM . The unlimited supply of mba transforms can be generated ensuring an ever-growing search space to protect against new tools for analyzing and attacking transformed software. S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 61–75, 2007. c Springer-Verlag Berlin Heidelberg 2007 0002
62
Y. Zhou et al.
We provide practical threat scenarios. We then introduce mba transforms and their use for protection against such threats. Lastly, we provide practical examples of transformed algorithms.
2 2.1
Motivating Scenarios Na¨ıve Code
In Fig. 1(A)’s scenario, both algorithm and constant data C are unprotected, permitting algorithmic reverse engineering by static analysis and extraction of C by searching the executable file. This is the normal state of software as written, compiled, and delivered. 2.2
Hiding Constants from Static Analysis
Hiding C in code gives us Fig. 1(B). The algorithm is still exposed to static analysis, but C (created at runtime) no longer appears in the executable file. This (without mba methods) was the situation in the Advanced Access Control System (aacs: see Fig. 2), a copy protection system adopted for hd-dvdTM and BluRayTM optical discs, which was hacked[22] by various parties by initially obtaining the title key, Kt , from memory, then working back through the chain to obtain earlier keys from memory, such as the media key, Km . Since the algorithms are public, obtaining the keys allows for off-line decryption of the licensed content by an application that does not comply with licensing terms (e.g. writes decrypted content to a file). Obtaining the higher level keys allows the others to be calculated, making the memory-key-extraction step simpler. As in Fig. 1(B), while the software manufacturers made an effort to hide the device keys, they did not prevent intermediate values from appearing in memory. 2.3
Hiding Constants and Algorithms from Dynamic Analysis
Finally, in Fig. 1(C)’s scenario, we protect the value of C and its associated code by applying code and data transforms[3,10,21]. The constant code generates C T (C under transformation T ) with corresponding changes to the algorithm. Even if a dynamic attack recovers C T , an attacker must reverse-engineer of the transformed algorithm to find T . E.g. if C is a cryptographic key, offline decryption is infeasible until T is found or the algorithm code is lifted, but transform diversity[21] makes lifting less useful. Moreover, lifting could be made difficult if the code surrounding decryptions were mathematically intermixed with the decryptions using mba transforms. Example: AACS. The above is the approach we would recommend for systems such as aacs: using mba transforms would ensure that only transformed key values appear in memory or registers during computation, with a transform not useful for offline decryption without significant additional analysis.
Information Hiding in Software with Mixed Boolean-Arithmetic Transforms
(A)
Input( y, y1, .. , ym ) Program
(B)
Input( y, y1, .. , ym )
Input( y, y1, .. , ym ) Program Constant Code
Program
C
(C)
63
Constant Code
CT
Algorithm Code
C
Output( z, z1, .. , zn )
Algorithm Code
Algorithm Code Output( z, z1, .. , zn ) Output( z, z1, .. , zn ) Fig. 1. Basic Scenarios
A possible attack point would be to compare known title keys (obtained from previously hacked titles) with the corresponding transformed keys. To render such an attack ineffective, the transformation must be sufficiently complex that determining its functionality from such isolated plain-key-to-transformed-key instances is infeasibly hard for the system’s anticipated attackers. AACS HD−DVD Volume ID
HD−DVD MKB Device Keys & Process MKB Algorithm
Encrypted Key
AESG Km
Kvu
AES Decrypt
Encrypted Content
Kt
AES Decrypt
Content Fig. 2. AACS
Example: Software Watermarking. Another possible use for key and algorithm hiding is software watermarking in order to prove ownership of the software or intellectual property contained therein if the software is used without a license. Software watermarking is not to be confused with steganographic watermarking, where a message is hidden images or an audio or video stream, although some systems (e.g., drm systems) permit both forms of protection to be used (software watermarking for drm components, steganographic watermarking for the content they manage).
64
Y. Zhou et al.
Q: if I K then K = E (K) W = emit_watermark ( P , K ) output W else P (I) Fig. 3. Watermarking
In §4.2, we will provide a simple example where software function is dependent on a secret key K which can be used to prove software ownership. mba transforms effectively protect the software watermark algorithm by making it very hard to discover K and by interlocking (see §3.4) the constant code producing K and the watermark algorithm (see Fig. 3).
3 3.1
Mixed Boolean-Arithmetic (MBA) Transforms Basic Definitions
Microprocessor arithmetic logic units (alus) use arithmetic operations including addition +, subtraction − (whence comparisons <, ≤, =, ≥, > ), multiply · , left shift 0004, arithmetic right shift 0005s , and logical right shift 0005. Bitwise operations include exclusive-or ⊕, inclusive-or ∨, and ∧, and not ¬. With n-bit 2’s complement integers, arithmetic operations are in integer modular ring Z/(2n ). Bitwise operations operate over Boolean algebra (Bn , ∧, ∨, ¬). All above computations are captured in the ba-algebra, BA[n]. Definition 1. With n a positive integer and B = {0, 1}, the algebraic system (Bn , ∧, ∨, ⊕, ¬, ≤, ≥, >, <, ≤s, ≥s , >s , <s , =, =, 0005s , 0005, 0004, +, −, · ), where 0004, 0005 denote left and right shifts, · (or juxtaposition) denotes multiply, and signed compares and arithmetic right shift are indicated by s , is a Boolean-arithmetic algebra (ba-algebra), BA[n]. n is the dimension of the algebra. BA[n] includes the Boolean algebra (Bn , ∧, ∨, ¬), the integer modular ring Z/(2n ), Galois field gf(2n ), and p-adic numbers[19]. The first two structures are most used in real applications and form the basis of our techniques for hiding keys (constants) and operations over BA[n]. Definition 2. With BA[n] a ba-algebra and t a positive integer, a function f: (Bn )t → Bn of the form ⎛ ⎞ 0005 0002 ai ⎝ ei,j (x1 , . . . , xt )⎠ , i∈I
j∈Ji
Information Hiding in Software with Mixed Boolean-Arithmetic Transforms
65
where ai are constants, ei,j are bitwise expressions of variables x1 , . . . , xt over Bn , and I, Ji ⊂ Z, are finite index sets, ∀i ∈ I, is a polynomial mixed Booleanarithmetic expression, abbreviated to a polynomial mba expression. (If each of x1 , . . . , xt is itself a polynomial mba expression of other variables, the composed function is likewise a polynomial mba expression over Bn .) Each non-zero summand in the expression is a term. A linear mba expression is a polynomial mba expression of the form 0002 ai ei (x1 , . . . , xt ), i∈I
where ei are bitwise expressions of x1 , . . . , xt and ai are constants. Two examples of polynomial mba expressions over BA[n] are: f (x, y, z, t) = 8458(x ∨ y ∧ z)3 ((xy) ∧ x ∨ t) + x + 9(x ∨ y)yz 3 , f (x, y) = x + y − (x ⊕ (¬y))) − 2(x ∨ y) + 12564. The latter is a linear mba expression. As indicated in [20], all integer comparison operations can be represented by polynomial mba expressions with results in their most significant bit (msb). For example, the msb of (x − y) ⊕ ((x ⊕ y) ∧ ((x − y) ⊕ x)) s
is 1 if and only if x < y. 3.2
Linear MBA Identities and Expressions
We now show the existence of an unlimited number of linear mba identities. We use truth tables, where the relationship of variables of the expression in the table and conjuncts is shown by example in Table 1. Bn for i = Theorem 1. Let n, s, t be positive integers, let xi be variables over s−1 1, . . . , t, let ej be bitwise expressions on xi ’s for j = 0, . . . , s−1. Let e = j=0 aj ej be a linear mba expression, where aj are integers, j = 0, . . . , s − 1. Let fj be the deduced Boolean expression from ej , and let (v0,j , . . . , vi,j , . . . , v2t −1,j )T be the column vector of the truth table of fj , j = 0, . . . s − 1, and i = 0, . . . , 2t − 1. Let A = (vi,j )2t ×s , be the {0,1}-matrix of truth tables over Z/(2n ). Then e = 0 if and only if the linear system AY = 0 has a solution over ring Z/(2n ), where Ys×1 = (y0 , · · · , ys−1 )T is a vector of s variables over Z/(2n ). Proof. If e = 0, (a0 , a1 , · · · , as−1 )T is plainly a solution of the linear system. Assume a solution exists. Let zji represent the i−th bit value of ej , j = 0, 1, · · · , s − 1, i = 0, 1, · · · , n − 1. Truth tables run over all inputs, so row vectors of matrix A run over all values of i-th bit vector (z0,i , . . . , zs−1,i ) in the Boolean expressions, and via the above solution, s−1 j=0 aj zj,i = 0, for i = 0, . . . , n − 1. n−1 From the arithmetic point of view, ej = i=0 zj,i 2i . Thus we have s−1 0002
aj e j =
j=0
as required.
s−1 n−1 0002 0002 j=0 i=0
aj zj,i 2i =
n−1 s−1 00020002 i=0 j=0
aj zj,i 2i =
n−1 0002 i=0
s−1 0002 2i ( aj zj,i ) = 0, j=0
2
66
Y. Zhou et al.
By this theorem, any {0, 1}-matrix (vij )2t ×s with linearly dependent column vectors generates a linear mba identity of t variables over Bn . For example, the {0, 1}-matrix ⎛ ⎞ 00011 ⎜0 1 1 1 1 ⎟ ⎟ A=⎜ ⎝1 0 1 1 1 ⎠ 11101 with column-vector truth-tables for f0 (x, y) = x, f1 (x, y) = y, f2 (xy ) = x ∨ y, f3 (x, y) = ¬(x∧y), f4 (x, y) = 1, respectively, with solution (1, 1, −1, 1, −1)T over Z/(232 ), generates linear mba identity x + y − (x ∨ y) + (¬(x ∧ y)) − (−1) = 0, using x1 = x, x2 = y, and −1 = all 1’s. The interested reader can write C code to verify the identity for any x, y ∈ B32 by defining x and y as unsigned or signed 32-bit integers. Other linear mba identities can be found in [20]. The following theorem states that any Boolean function has non-trivial linear mba expressions. It allows us to embed hard Boolean functions into mba transforms. Table 1. Truth Table for x1 ∨ (x2 ⊕ (¬x3 )) Conjunction Binary Result (¬x1 ) ∧ (¬x2 ) ∧ (¬x3 ) 000 1 (¬x1 ) ∧ (¬x2 ) ∧ ( x3 ) 001 0 (¬x1 ) ∧ ( x2 ) ∧ (¬x3 ) 010 0 (¬x1 ) ∧ ( x2 ) ∧ ( x3 ) 011 1 ( x1 ) ∧ (¬x2 ) ∧ (¬x3 ) 100 1 ( x1 ) ∧ (¬x2 ) ∧ ( x3 ) 101 1 ( x1 ) ∧ ( x2 ) ∧ (¬x3 ) 110 1 ( x1 ) ∧ ( x2 ) ∧ ( x3 ) 111 1
Theorem 2. Let e be a bitwise expression of m variables over Bn . Then e has m a non-trivial linear mba expression. That is e = 2i=0−1 ai ei , where each ai ∈ Bn , each ei is a bitwise expression of the m variables, and e = ei in Bn , for i = 0, 1, . . . , 2m − 1. Proof. Let P2m ×1 be the column vector of the truth table of the deduced Boolean expression from e. Pick an invertible {0, 1}-matrix A2m ×2m over Z/(2n ). If a column vector of A is P , add another column vector to it. Then we have an invertible matrix A with all column vectors distinct from P . Suppose the solution of linear equation AY = P2m ×1 is Y = (y0 , y1 , . . . , y2m −1 )T . Let matrix Q2m ×(2m +1) = (P, A), and X(2m +1)×1 = (−1, y0 , y1 , . . . , y2m −1 )T . Because AY = P , it is easy to show that QX = 0. Following the disjunctive normal form (or any standard form) of Boolean functions, for each column vector Ai of matrix A we have a unique Boolean expression ei of m variables with Ai being its truth table, i = 0, 1, . . . , 2m − 1.
Information Hiding in Software with Mixed Boolean-Arithmetic Transforms
Since QX = 0, Theorem 1 gives a linear mba identity e − By our choice of A, ei = e for all i. 3.3
2m −1 i=0
67
yi ei = 0. 2
Permutation Polynomials and Other Invertible Functions
We use polynomial functions over Z/(2n ) to generate polynomial mba expressions. We need invertible polynomials[15,13,12], and both the polynomial and its inverse must be of limited degree so that polynomial code transformations are efficient. We now show that there are many such invertible polynomials. Theorem 3. Let m be a positive integer and let Pm (Z/(2n )) be a set of polynomials over Z/(2n ): m
0002 n i n 2 ai x ∀ai ∈ Z/(2 ), a1 ∧ 1 = 1, ai = 0, i = 2, . . . , m . Pm (Z/(2 )) = i=0
Then (Pm (Z/(2n )), ◦) is a permutationgroup under the functional composition m m operator ◦. For every element f (x) = i=0 ai xi , its inverse g(x) = j=0 bj xj can be computed by bm = −a−m−1 am , 1
000e 000f m j j−k ak − a−1 Aj , m − 1 ≥ k ≥ 2, bk = −a−k−1 1 1 j=k+1 k a0 m j−1 −1 −1 b 1 = a1 − a1 j=2 ja0 Aj , m b0 = − j=1 aj0 bj ,
where Am = −a−m 1 am , and Ak are recursively defined by Ak = −a−k 1 ak −
m 000e 000f 0002 j j=k+1
k
aj−k Aj , 0
for 2 ≤ k < m.
Proof. By definition of Pm (Z/(2n )), the coefficient of x is odd, and coefficients of higher degrees are even. By [15], they are permutation polynomials. To show group, we compute all coefficients of the composition g(x) ◦ (Pm (Z/(2n )), ◦) is a m m j i f (x) = g(f (x)) = b f (x) of any two elements f (x) = a x and j=0 j i=0 i m j n g(x) = j=0 bj x ∈ Pm (Z/(2 ). For any j ∈ {2, . . . , m}, we have bj f (x)j = (i1 ···ij ),ik ∈{0,..,m},k=0,..,j bj ai1 · · · aij xi1 +···+ij = (i1 ···ij ),ik ∈{0,1},k=0,..,j bj ai1 · · · aij xi1 +···+ij (since bj ai = 0, ∀i ≥ 2) 000e 000f = bj jk=0 jk ak1 aj−k xk ; 0 j 000e 000f m m xk g(f (x)) = b0 + i=0 b1 ai xi + j=2 bj k=0 jk ak1 aj−k 0 000e 000f m m = b0 + k=0 (b1 ak + j=2,j≥k bj jk ak1 aj−k )xk 0 000e 000f 000e 000f m m j j−1 = j=0 a0 bj + j=1 ja0 a1 bj x 000e 000e 000f 000f m m + k=2 ak b1 + j=k jk aj−k ak1 bj xk . 0
68
Y. Zhou et al.
i Let m i=0 ci x denote g(f (x)). Then c1 is odd since both a1 and b1 are odd, and all bj , j ≥ 2, are even. For all ak and bj , k, j ≥ 2, a2k = b2j = 0, so we have c2i = 0, i ≥ 2. Therefore g(f (x)) ∈ Pm (Z/(2n )) and Pm (Z/(2n ) is a group. Let us compute the inverse of f (x). Assume g(f (x)) = x, which implies c1 = 1 Similarly, and ci = 0, i = 0, 2, · · · , m. From cm = 0 we have bm = −a−m 1 am b 1 . 000e 000f m −k . for all k, m > k ≥ 2, ck = 0 implies bk = −a1 ak − j=k+1 bj jk aj−k 0 Observing that the second term of bk is defined by bj , j > k, and b1 is a common factor of all these bk starting at bm , we can define bk = b1 Ak such that all Ak −m can be computed from 000e 000f Am , Am−1 , . . . , Ak+1 ; i.e., Am = −a1 am , and Ak = m j j−k Aj , from known coefficients of f (x). From c1 = 1 −a−k 1 ak − j=k+1 k a0 m m j−1 −1 −1 we obtain b1 = a1 (1 + j=2 jaj−1 = a−1 0 Aj ) 1 (1 − j=2 ja0 Aj ). The latter identity holds because A2k = 0 for all k, m > k ≥ 2. Based on this formula for b1 and the identities bk = b1 Ak , and c0 = 0, we can compute all other coefficients recursively: bm = −a−m−1 am , and for all k with m > k ≥ 2, 1 000e 000f m m j−1 j j−k −k Aj ) bk = a−1 1 (1 − j=2 ja0 Aj )(−a1 ak − j=k+1 k a0 000e 000f m j j−k ak − a−1 Aj . = −a−k−1 1 1 j=k+1 k a0 We then easily derive b0 = −
m
j=1
aj0 bj .
2
P1 (Z/(2n )) is used for data transforms in [10]. Formula of polynomial inverses of polynomials in P2 (Z/(2n )) is given in [21]. Following Theorem 3, for cubic polynomial f (x) = a3 x3 + a2 x2 + a1 x + a0 in P3 (Z/(2n )), its inverse is −3 −4 3 2 f −1 (x) = (−a−4 1 a3 )x + (−a2 a1 + 3a0 a1 a3 )x −1 −3 −1 2 −4 3 −4 +(a1 + 2a0 a1 a2 − 3a0 a1 a3 )x − a0 a1 − a20 a−3 1 a2 + a0 a1 a3 .
The above theorem implies that all encoding, decoding and composition functions are in conveniently similar formats for generating code transforms. Over BA[n], there are other types of invertible functions that can be used together with mba identities and permutation polynomials. For example, Z/(2n ) invertible matrices can be composed with invertible polynomials to form polynomial matrices and matrix polynomials. The T-functions of [11,12] provide another example. 3.4
Code Transforms Via Zero and Invertible MBA Functions
Operating on transformed data requires that the data and code (see Fig. 1(A)) use corresponding transforms (see Fig. 1(C)). mba transforms constructed using zero functions (functions returning zero irrespective of their inputs) and invertible functions achieve this while interlocking and hiding the original operations. Linear mba identities and permutation polynomials over BA[n] are our main resources.
Information Hiding in Software with Mixed Boolean-Arithmetic Transforms
69
There are two basic methods to hide operation in polynomial mba expressions. The first one is to treat code (possibly a single operation or variable) as a subexpression in a polynomial mba zero function and replace it with the negated sum of the remaining terms (i.e., it performs an identity substitution k−1 n n tk = − i=1 ti + i=k+1 ti derived from a zero function of the form i=1 ti = 0). The other method uses functional compositions of mba functions and the first method. For example, given two invertible functions S and T , function f is S −1 ◦ S ◦ f ◦ T ◦ T −1 . The associative law of functional composition allows us to write f as S −1 ◦ (S ◦ f ◦ T ) ◦ T −1. Applying the first method to (S ◦ f ◦ T ) before compositions we obtain a new expression of f ; i.e., we compute on ‘encrypted’ data with ‘encrypted’ functions[16,17]. A key technique is interlocking: (1) reverse partially evaluate: at Y, replace a function f : A → C with a function g: A × B → C such that g( · , b) = f ( · ) but g( · , x) where x = b gives a nonsense result and b (absent tampering) comprises value(s) computed at X, and then (2) make code at site Y highly dependent on code at site X by using values produced or obtained at X as coefficients in Y’s transforms and finally (3) apply the remaining protections in this paper to make the interlock extremely hard to eliminate. Removing such interlocks requires that the attacker fully understand what has been done (highly nontrivial, as we argue in §5), and tampering at X produces chaotic behavioral changes at Y . Thus dense interlocking renders code aggressively fragile under tampering. Interlocks also foil code-lifting. Compositions of zero and/or invertible functions with original operations in specific orders give us desired polynomial mba expressions, as shown in the proof below. Proposition 1. Let m be a positive integer. Then every operation in ba-algebra BA[n] = (Bn , ∧, ∨, ⊕, ¬, ≤, ≥, >, <, ≤s, ≥s , >s , <s , =, =, 0005s , 0005, 0004, +, −, · ) can be represented by a high degree polynomial mba expressions of multiple terms of m variables over Bn . Proof. It suffices to show that each operation can be represented by a linear mba identity of at least two terms, since then we can apply operation and composition transforms with invertible polynomials according to Theorem 3 to obtain high degree polynomial mba expressions of any desired number of variables. Any single variable x over Bn can be represented by a linear mba expression using any number of variables: this follows immediately from Theorem 2 by using x’s truth table. If we have x = i∈I ei , where ei are bitwise expressions and y = t where t are bitwise expressions, then x ± y is a linear mba expression s s s∈S and by Z/(2n ) distributivity, xy is in a polynomial mba expression. So is 0004 : it is a multiplication. For right shifts we have x 0005s y = −((−x − 1) 0005s y) − 1 and x 0005 y = −1 − ((−x − 1) 0005 y) − (((−1) 0005 y) ⊕ (−1)). Theorem 2 indicates that all bitwise operations can be a linear mba expression with at least two terms. Formulas in [20] express arbitrary signed and unsigned comparisons as bitwise operations with subtraction. We then apply Theorem 2 to obtain expressions of more that two terms. (This construction is one of many due to BA[n]’s rich mathematical structure.) 2
70
4 4.1
Y. Zhou et al.
Protection Methods Simple Constant Hiding Using MBA Transforms
In many applications, important information is represented by constants (keys). Here, we provide methods to turn them into executable code which computes the keys irrespective of its inputs, letting us seamlessly embed keys in applications. wlog, we consider n-bit key hiding in BA[n] only: if the key size bigger than n, we simply generate multiple n-bit keys which are concatenated to obtain the big key. We start with the following result to show that for any constant over Bn , there is a polynomial mba expression computing that constant. Proposition 2. For any constant K in BA[n] and any positive integer m there is a multiterm polynomial mba function f with f (x1 , . . . , xm ) = K for arbitrary x1 , . . . , xm in BA[n]. Proof. We use quadratic polynomial and linear mba identities to construct f . By Theorem 3, with a = 0, we have an invertible polynomial p(x) = ax2 + bx + c and its inverse q(x) = αx2 + βx + γ. Theorem 1 tells us that any singular matrix of size 2m × t, with s, t positive integers, produces an m-variable linear mba r identity. Let i=1 ai ei = 0 be an identity with multiple non-zero terms. Then r K = q(p(K)) = q( i=1 ai ei +p(K)). Expanding the expression after rearranging r terms in i=1 ai ei +p(K) yields an m-variable polynomial mba expression. (This construction is one of many due to BA[n]’s rich mathematical structure.) 2 The variable inputs obfuscate the code, and its mba expression frustrates simplification. The variables are typically shared with the remainder of the application. Appendix A gives a practical example. 4.2
Algorithm and Data Hiding Example: Software Watermarking
As noted in §2.3, systems such as aacs can greatly benefit from mba-based algorithm and data hiding. As another example, we now show that they can provide a secret watermark to prove identity of software[6], where a watermark is an n-bit constant W , its extraction key is a k-bit constant K, where k is large, and we must embed W in program P so that we can reliably extract it under K, but an attacker (ignorant of K) cannot obliterate it. (To make n and k large, we can compose W from W1 , . . . , Ww and K from K1 , . . . , Kw ; i.e., we can use multiple sub-watermarks and multiple corresponding sub-keys. The methods should be obvious: we need not discuss this further.) Watermark Injection. We can represent W and 2n − W as multiterm polynomial mba expressions by Proposition 2: 0002 0005 as es,j (x1 , . . . , xp ), W = f (x1 , . . . , xp ) = s∈S
2n − W = g(y1 , . . . , yq ) =
j∈Js
0002 i∈I
ai
0005
ei,j (y1 , y2 , . . . , yq ),
j∈Ji
for any integer variables x1 , . . . , xp , y1 , . . . , yq ∈ Bn .
Information Hiding in Software with Mixed Boolean-Arithmetic Transforms
71
Suppose P has an mba expression 0002 0005 h(z1 , . . . zr ) = at et,j (z1 , . . . zr ), t∈T
j∈Jt
as an intermediate computation. For example, h could be any single BA operation, which can be represented by multiterm polynomial mba expressions due to Theorem 3. We can embed watermark W into h based on h = W + h + (2n − W ). wlog, assume index sets I, S and J are disjoint. Since x1 , . . . , xp , y1 , . . . , yq are random variables, we can choose some of them from the set {z1 , . . . , zr }. Therefore, the intersection of variable sets {y1 , . . . , yq } ∩ {x1 , . . . , xp } ∩ {z1 , . . . , zr } maybe nonempty. Represent all with the new set {u1 , . . . , uw } = {y1 , . . . , yq } ∪ {x1 , . . . , xp } ∪ {z1 , . . . , zr }. Let σ be any permutation of index set I ∪ S ∪ J. We have 0005 0002 aσ(t) eσ(t),j (u1 , u2 , · · · , uw ). h= σ(t)∈I∪S∪J
j∈Jσ(t)
Watermark W is computed by adding all terms with indices in σ(S) and permutation σ is the watermark key K. To further obfuscate the watermark, apply permutation polynomial p to the watermark expression to mix its terms with terms in h. Then the table-concatenation of permutation σ and the the inverse permutation polynomial p−1 ’s coefficients is watermark key K. Following this general watermarking method, we can embed constant watermarks into any BA operations and mba expressions. The embedding is stealthy because recovering a watermark without the key is an instance of the npcomplete Subset Sum problem, and can be made hard in practice by applying the protections of §3.4. Watermark Extraction. For any program P there are likely to be inputs which are extremely improbable in normal use. Let P be a program containing an injected watermark W under key K as described above. At this point, extracting W from the executable code is awkward, and the techniques of this very paper could be used to obscure it and make it effectively impossible to extract. Instead, the program extracts its own watermark by re¨ using the above polynomial computations. (Generating code to do this which we are building the code is easy.) Let an extremely improbable lineup of inputs for P be K. Find an encoding or encodings E according to §3.4 such that E(K) = K (an easy problem). Then, replace P with P0006 which, if its input-lineup I = K, computes K = E(K) and uses K to extract and output W , but in all other circumstances, computes P(I) as in the original program (see Fig. 3). Now, encode P0006 according to §3.4, using interlocking to render the watermarking aspects of the code highly fragile under tampering, and making the normal behavior of P0006 highly dependent on the watermarking code, producing
72
Y. Zhou et al.
final program Q, the final watermark-protected version of P, with a watermark W protected under the new encoded key K. Extensions to interactive or transaction programs and the like are straightforward and left as an exercise. Watermark Robustness. The watermark in final watermarked program Q above is protected as follows. The attacker does not know K and so cannot directly attack the response to K. Attempting to otherwise change the i/o behavior of the program produces chaotic results: useless to the attacker since it simply destroys the program instead of obliterating the watermark. Further complicating the program by the methods in this paper does not obliterate the watermark, since so protecting programs by transforms and identities preserves functionality and the watermark extraction facility is a (very well hidden) part of that functionality. Of course, the attacker could add another such watermark, but that would not obliterate the original one.
5
Security of
MBA
Transforms
By experiment, analytical math tools (MathematicaTM , MapleTM ) can’t simplify most mba expressions. Moreover, the following problems are np-hard: 1. BA[n]-sat: for a given polynomial mba function f (x1 , . . . , xm ) over BA[n], find values a1 , . . . , am such that f (a1 , · · · , am ) = 0; 2. BA[n]-recog: for polynomial mba functions f (x1 , . . . , xm ), g(x1 , . . . , xm ) over BA[n], find values a1 , . . . , am such that f (a1 , . . . am ) = g(a1 , . . . , am ). Proposition 3. BA[n]-satand BA[n]-recog are np-complete. Proof. The result follows trivially from np-completeness of Boolean sat.
2
(Efficient Boolean sat-solvers can’t directly solve these. Boolean sat is a small subset of BA[n]-sat: the above problems require solvers for vectors of mutually constrained Boolean sat-problems.) Average-case complexity theory is too preliminary for direct security proofs: in the Average Case Complexity Forum[2], the papers list was last updated in March of 2000; of roughly 1200 papers added to the Electronic Colloquium on Computational Complexity[8] since 1994, about one percent seem to be on average case complexity. We must argue our case by other means. The best available theory on mbas is provided in this very paper. We provide easy ways to complicate, but not to simplify, programs. Consider a program whose computations are interlocked and encoded according to §3.4. Then consider any subexpression repeatedly transformed via the constructions of Theorems 1 and 2. Plainly the program interlocking and encoding and have vastly many choices, and by the nature of the constructions the identity-substitutions can each have vastly many choices as well. The simplification problem for such programs is the problem of reversing the above process; the vast numbers of choices at every step make its search space
Information Hiding in Software with Mixed Boolean-Arithmetic Transforms
73
vast. Indeed, as expressions grow during this process, the increase in choices with increasing numbers of steps is hyperexponential. We don’t expect efficient simplifiers for such massively complex constructions any time soon. MapleTM , MathematicaTM , or similar foreseeable tools, need large databases of known identities and laws. As we have argued above, for mbas, we should change ‘large’ to ‘extremely large’, and probably to ‘infeasibly large’; methods depending on simple laws such as Karnaugh-map Boolean simplification are inadequate. New characterizations of np problems, such as pcp, might eventually offer attack methods — but probably not soon. Without an efficient simplifier, the limitless variety of polynomial mba representations for any given BA operation, and the code composed of such operations, becomes an insurmountable BA[n]-recog function recognition burden for the attacker: the attacker must recognize the code functionality, but the search space for the attacker is vast and grows over time as increasing numbers of transforms are generated. We can apply other protections before and/or after those in this paper: see [14] for an excellent survey. Such techniques are important to provide defense in depth and to address other attacks, for example, tampering or spoofing of system calls. This paper’s methods proposed ensure viability of such additional protections by obviating any need for specialized hardware.
6
Conclusion
We have introduced powerful new methods for protecting constants, data, and code in software by converting computations into mixed Boolean-arithmetic computations, whether linear or polynomial, over ba-algebras, which provide a rich algebraic system based on the functionality of real-world computer instructions. Such protections are not penetrable using existing analytical tools such as MapleTM or MathematicaTM , since they combine multiple algebraic systems into one exceedingly complex system. By basing our protections on ordinary computer instructions, we avoid the vulnerabilities of methods of protection which require code to appear in unexecutable forms or in forms which require an interpreter or a substantial supporting library. These mba transform methods provide open-ended generators for identities and transformations which provide an effectively unlimited supply of encodings for data and code, making the search-space for attackers very large, and increasingly large over time, as we accumulate more and more vast sets of identities and transforms. We have described methods for hiding constants in code and for protecting code by keys, so that tampering with the code, with high probability, causes the code to malfunction, and given practical examples to demonstrate our methods. Acknowledgements. The authors thank their colleagues Phil Eisen, Clifford Liem, and the anonymous referees, for valuable comments.
74
Y. Zhou et al.
References 1. Arora, S., Safra, S.: Probabilistic checking of proofs: A new characterization of NP. Journal of the ACM 45(1), 70–122 (1998) 2. Average Case Complexity Forum, http://www.uncg.edu/mat/avg.html 3. Chow, S., Johnson, H., Gu, Y.X.: Tamper resistant software encoding, US Patent No. 6594761 (2003) 4. Chow, S., Gu, Y.X., Johnson, H., Zakharov, V.A.: An Approach to the Obfuscation of Control-Flow of Sequential Computer Programs. In: Davida, G.I., Frankel, Y. (eds.) ISC 2001. LNCS, vol. 2200, pp. 144–155. Springer, Heidelberg (2001) 5. Chow, S., Eisen, P., Johnson, H., van Oorschot, P.C.: White-Box Cryptography and an AES Implementation. In: Nyberg, K., Heys, H.M. (eds.) SAC 2002. LNCS, vol. 2595, Springer, Heidelberg (2003) 6. Collberg, C.S., Thomborson, C.: Watermarking, Tamper-Proofing, and Obfuscation - Tools for Software Protection. IEEE Trans. Software Eng. 28(6) (June 2002) 7. Chen, Y., Venkatesan, R., Cary, M., Pang, R., Sinha, S., Jakubowski, M.H.: Oblivious Hashing: A Stealthy Software Integrity Verification Primitive. In: Petitcolas, F.A.P. (ed.) IH 2002. LNCS, vol. 2578, pp. 400–414. Springer, Heidelberg (2003) 8. Electronic Colloquium on Computational Complexity, http://eccc.hpi-web.de/eccc/ ISSN 1433-8092 9. Garey, M.R., Johnson, D.S.: Computers and Intractability. W.H. Freeman and Company, New York (1979) 10. Kandanchatha, A.N., Zhou, Y.: System and method for obscuring bit-wise and two’s complement integer computations in software, Canadian patent application 2456644, 2004; US Patent Application 20050166191 (2005) 11. Klimov, A., Shamir, A.: Cryptographic Applications of T-functions. In: Matsui, M., Zuccherato, R.J. (eds.) SAC 2003. LNCS, vol. 3006, pp. 248–261. Springer, Heidelberg (2004) 12. Klimov, A.: Applications of T-functions in Cryptography, PhD Thesis, Weizmann Institute of Science (2004) 13. Mullen, G., Stevens, H.: Polynomial functions (mod m). Acta Mathematica Hungarica 44(3-4), 287–292 (1984) 14. van Oorschot, P.C.: Revisiting Software Protection. In: Boyd, C., Mao, W. (eds.) ISC 2003. LNCS, vol. 2851, pp. 1–13. Springer, Heidelberg (2003) 15. Rivest, R.L.: Permutation Polynomials Modulo 2w . Finite Fields and their Applications 7, 287–292 (2001) 16. Sander, T., Tschudin, C.F.: Towards Mobile Cryptography. In: Proceedings of the 1998 IEEE Symposium on Security and Privacy, pp. 215–224 (1998) 17. Sander, T., Tschudin, C.F.: Protecting Mobile Agents Against Malicious Hosts. In: Vigna, G. (ed.) Mobile Agents and Security. LNCS, vol. 1419, pp. 44–60. Springer, Heidelberg (1998) 18. Ogiso, T., Sakabe, Y., Soshi, M., Miyaji, A.: Software Tamper Resistance Based on the Difficulty of Interprocedural Analysis. In: Proceedings of WISA 2002 (2002) 19. Vuillemin, J.: Digital algebra and Circuits. In: Dershowitz, N. (ed.) Verification: Theory and Practice. LNCS, vol. 2772, Springer, Heidelberg (2004) 20. Warren Jr., H.S.: Hacker’s Delight. Addison-Wesley, Boston (2002), www.hackersdelight.org 21. Zhou, Y., Main, A.: Diversity via Code Transformations: A Solution for NGNA Renewable Security, The NCTA Technical PapersTM 2006, The National Cable and Telecommunications Association Show, Atlanta, pp. 173–182 (2006) 22. http://www.aacsla.com
Information Hiding in Software with Mixed Boolean-Arithmetic Transforms
A
Example of Key Hiding in an
MBA
75
Polynomial
Let the key be K = 0x87654321 (hex). We use three input variables x, x1 , x2 ∈ B32 , two linear mba identities 2y = −2(x ∨ (−y − 1)) − ((−2x − 1) ∨ (−2y − 1)) − 3; x + y = (x ⊕ y) − ((−2x − 1) ∨ (−2y − 1)) − 1; and one polynomial transform f (x) = 727318528x2 + 3506639707x + 6132886 ∈ P2 (Z/(232 )) to generate the following key code: a = x(x1 ∨ 3749240069); b = x((−2x1 − 1) ∨ 3203512843); d = ((235810187x + 281909696 − x2 ) ⊕ (2424056794 + x2 )); e = ((3823346922x + 3731147903 + 2x2 ) ∨ (3741821003 + 4294967294x2)); key = 4159134852e + 272908530a + 409362795x + 136454265b + 2284837645 + 415760384a2 + 415760384ab + 1247281152ax + 2816475136ad + 1478492160ae + 3325165568b2 + 2771124224bx + 1408237568bd + 2886729728be + 4156686336x2 + 4224712704xd + 70254592xe + 1428160512d2 + 1438646272de + 1428160512e2 + 135832444d, where a, b, d, e ∈ B32 are intermediate variables. The output value of key is always the constant K regardless of values in x, x1 , and x2 .
Geometrically Invariant Image Watermarking in the DWT Domain Shijun Xiang and Hyoung-Joong Kim Graduate School of Information Security, Center for Information Security Technologies (CIST), Korea University, Seoul 136-701, Korea [email protected], [email protected]
Abstract. Watermark resistance to both geometric attacks and lossy compressions is a fundamental issue in the image watermarking community. In this paper, we propose a DWT (Discrete Wavelet Transform) based watermarking scheme for such a challenging problem. Watermark resistance to geometric deformations is achieved by using the invariance of the histogram shape. In both theoretical analysis and experimental way, we show that the invariance can be extended to the DWT domain thanks to the time-frequency localization property of DWT. Consequently, we achieve the goal to embed a geometrically invariant watermark into the low-frequency sub-band of DWT in such a way that the watermark is not only invariant to various geometric transforms, but also robust to common image processing operations. Extensive simulation results demonstrate the superiority of the proposed watermark strategy due to the use of the histogram shape invariance combined with the DWT technique.
1
Introduction
With the development of the Internet and image processing techniques, more and more digital media (e.g., image and video) become available from online sites and easy to distribute illegal copy. Image watermarking as a potential technical solution has been developed for copyright protection of owner [1]. For identifying the illegal use of digital products, the watermark should be resistant to two different kinds of content-preserving manipulations. One is noise-like image processing operations, which may fail the watermark extraction by reducing the watermark energy. The other includes various geometric attacks. From the image watermarking point of view, geometric attacks mainly introduce synchronization errors between encoder and decoder. The watermark is still present, but the detector is no longer able to detect it. Thus, watermark robustness to geometric attacks is taken as a challenging issue. In the literature, only a few algorithms have presented the topic of how to achieve robustness against geometric attacks, which can be broadly classified into the following categories: S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 76–90, 2007. c Springer-Verlag Berlin Heidelberg 2007 0002
Geometrically Invariant Image Watermarking in the DWT Domain
77
i) Non-blind watermarking. The cost for resynchronization can be reduced by comparing the original image with the watermarked image which has undergone some geometric attacks, such as [2,3]. The non-blind watermarking schemes are limited for most of the practical applications. ii) Exhaustive search: Another obvious solution to desynchronization is to randomly search for the space including a set of acceptable attack parameters. One concern in the exhaustive search is the computational cost and the false alarm probability in the larger search space. iii) Invariant watermarking. Some researchers embedded the watermark into the affine invariant domain [4,5,6]. The watermark in the affine-invariant domains such as Fourier-Mellin transform can achieve the robustness to affine transforms. Though these techniques are robust to affine transform, they are vulnerable to cropping and difficult to implement. iv) Using reference mark. The authors in [7,8] embedded a template invariant to affine transform in the DFT domain. By searching for the template to estimate the attacked parameters for recovery of the watermark. The templatebased watermarking methods possibly suffer from the security issue [9]. v) Content-based invariance watermarking. By binding the watermark synchronization with the image characteristics, watermark detection avoids the synchronization errors. This class of watermarking methods usually exploit some special techniques to find those invariant features in an image. For instance, the watermark was embedded by using globally affine invariance of the moments [10,11,12] or locally invariant feature regions, such as image meshes [13] and Harris points [14]. A possible problem with content-based invariance watermarking is the computational burden in the detection due to the use of robust descriptor. In this paper, we propose a histogram-based invariant watermarking solution, which has a satisfactory robustness to various geometric attacks. The basic idea is to apply geometrically invariant property of the histogram shape in a way that the watermark is embedded by controlling the number of samples in each two neighboring bins. Considering those noise-like image processing operations, we extend the invariance of the histogram shape to the DWT domain, so that the watermark can be embedded in the low-frequency component to improve robustness performance. Stirmark [15] and Checkmark [16] based simulation tests demonstrate that the proposed watermarking algorithm is very robust to various geometric attacks and has a satisfactory performance for common image processing operations. The idea of modifying image statistics for watermarking is not new. For instance, patchwork-based watermarking methods (such as [17]) suppose that two sets of randomly selected pixels is Gaussian distributed with zero mean. The watermark sequence was embedded by shifting the mean among groups of two sets of pixels. The patchwork method is sensitive to geometric attacks since the detector can not find the patches correctly. In [18], a histogram specification was introduced for image watermarking. Later, the authors further presented some works to improve the watermarking method [19]. In [18,19], the watermark is in
78
S. Xiang and H.-J. Kim
fact designed as a histogram. The basic idea is that the pixels in the original image are regrouped to achieve a desired histogram (the watermarked histogram). Also, the histogram has been applied in other watermarking applications such as lossless data hiding [20]. Our proposed watermarking method is based on the histogram shape invariance, which is distinctively different from the previous histogram-based ones. In the next section, we will describe the characteristics of the histogram shape to geometric transforms via both the theoretical proof and the extensive testing. This is followed by a description of our proposed watermark embedding and detecting strategy. We then test and analyze the watermark robustness to geometric distortions and some common attacks. Finally, we draw the conclusions.
2
Invariant Features to Geometric Transformations
In this section, we first review geometric transformations briefly. We then investigate the insensitivity of the image histogram shape in the low-frequency of the DWT domain to geometric attacks. 2.1
Geometric Transformations
In [21], the authors introduced various geometric attacks on image watermarking system in detail. Common geometric distortions include affine transforms (such as rotation, scaling, translation and shearing) and cropping. RBAs (Random Bending Attacks) are recently reported geometric attacks, which are claimed as a challenging issue in image watermarking [21]. From the image watermarking point of view, all of the geometric attacks respect the rule that some or all of the pixels are displaced at a random amount under the constraint of visual coherence. Thus, geometric attacks mainly introduce desynchronization between encoder and decoder. Due to interpolation during geometric deformations, the pixel values will be modified slightly. Towards this direction, in the next section we investigate an invariant statistical feature (the histogram shape), which is mathematically invariant to affine transforms, as well as statistically resistant to those challenging geometric attacks, such as cropping and RBAs. 2.2
Invariance of the Histogram shape in the Spatial Domain
In [22], we analyzed the time-scale invariance of the histogram shape in 1-D audio signal. However, the image is quite different from the audio signal and the attacks the images may encounter are also different. In [23], we have addressed image histogram shape invariance in the spatial domain. There paper extends the invariance to the DWT domain for improving the robustness. The image histogram with equal-sized bins may be described by HM = {hM (i) i = 1, · · · , L},
(1)
Geometrically Invariant Image Watermarking in the DWT Domain
79
where HM is a vector denoting the gray-level histogram of the image F (i, j) = {f (i, j) i = 1, .., R, i = 1, .., C}, and hM (i), hM (i) ≥ 0 denotes the number of 0002L pixels in the ith bin satisfying i=1 hM (i) = R × C. Suppose that the resolution of the image is P bits, the relation between the number of bins L and the bin width M is calculated as 0003 if M od(2P /M ) = 0 2P /M (2) L= P 00032 /M 0004 + 1 otherwise, where 00030004 is the floor function. Consider the case of pure non-proportional scaling over the image F (i, j). Suppose that F 0003 (x0003 , y 0003 ) = {f 0003 (x0003 , y 0003 )} is the scaled image with the scaling factors α and β in both vertical and horizontal directions. f 0003 (x0003 , y 0003 ) is the value in the point (x0003 , y 0003 ), theoretically satisfying the expression f 0003 (x0003 , y 0003 ) = f (x/α, y/β). In the new version, the number of rows and columns are calculated as R0003 = αR and C 0003 = βC. The histogram of F 0003 (x0003 , y 0003 ) can be formulated as 0003 HM = {h0003M (i) i = 1, · · · , L},
(3)
which satisfies the expression h0003M (i) = hM (i)·α·β in theory, referred to Equation (1). Equation (3) indicates the invariance of the histogram shape to the scaling operation because under the scaling the number of elements in the bins are modified linearly. In practice, the number of the pixels in each bin may be slightly modified due to interpolation. Rotation and translation are two common operations, which will be able to modify the pixel positions in the image plane. In the two cases, the histogram shape will be invariant due to the fact that the histogram is independent of the pixel position. This kind of special property also provides the histogram shape a capability against other challenging geometric attacks, such as cropping and RBAs. In practice, the interpolation errors during rotation and RBAs will distort the pixel slightly. As a result, the histogram shape will be changed a little. 2.3
The Histogram Shape Invariance in the DWT Domain
As we have known, the watermark can be embedded in the low-frequency component to improve watermark robustness performance, referred to the previous watermarking schemes in the DCT [1] and DWT [8] domain. Towards this direction, we extend the invariance of the histogram shape to the low-frequency subband of the DWT domain by exploiting the time-frequency localization property of DWT. The detailed procedure is described below. Assume that F (x/α, y/β) is the scaled version of a digital image F (x, y). The pixels in F (x, y) and F (x/α, y/β) can be interpreted as the following 1-D vectors by using zigzag scanning, respectively: V1 = {v1 (i) i = 1, · · · , R × C} and V2 = {v2 (j) j = 1, · · · , α · R × β · C}, (4) where R and C respectively denote the size of rows and columns in F (x, y). Let {ckα,β (j)} and {ck (i)} denote the low-frequency sub-band coefficients of V1 and
80
S. Xiang and H.-J. Kim
V2 after a k-level 1-D DWT, respectively. When k is 0, it is equivalent in the spatial domain. If k is 1, we have: 0003 0002L 1 ck (i) = c1 (i) = l=1 g(l) · v1 (2(i − 1) + l) 0002 1 (5) ckα,β (j) = c1α,β (j) = L l=1 g(l) · v2 (2(j − 1) + l), where {g(l) l = 1, · · · , L1 } is the low-pass filter. The DWT operation above will make the histogram difference between V1 and V2 in the DWT domain though the theoretical invariance to the scaling is right in the spatial domain. It is due to the fact that some of the neighboring samples belong to different bins. This problem can be solved by regrouping the samples in V1 and V2 according to their spatial-domain histogram shape, and then applying the DWT to the pixels of each bin. Referred to Equation (5), the time-frequency localization characteristic of DWT shows that each transformed coefficient in the lowfrequency sub-band is generated by filtering the neighboring L1 samples with the low-pass filter {g(i)}. As a result, the DWT coefficients obey the same the histogram shape as the spatial-domain one. Let us take a simple example. Suppose that there are 6 neighboring samples, their intensity values denoted by a vector X = [1, 6, 2, 5, 3, 4]. We extract the histogram with 3 equal-sized bins. By regrouping X to generate a new vector Xs = [1, 2, 3, 4, 5, 6]. Then, to perform 1-level DWT on X and Xs with the 0003 db10003 wavelet base with the low-pass filter of √ √ [ 22 , 22 ]. The low-frequency coefficients of X and Xs are respectively computed and denoted as C = [(1 + 6) × Cs = [(1 + 2) ×
√
√ √ 2 2 2 , (2 + 5) × , (3 + 4) × ] √2 √2 √2 2 2 2 2 , (3 + 4) × 2 , (5 + 6) × 2 ]
√
√
√
= [ 7 2 2 , 7 2 2 , 7 2 2 ], √ √ √ = [ 3 2 2 , 7 2 2 , 112 2 ].
Obviously, there are 3 low-frequency coefficients after 1-level DWT in C and Cs . Without loss of generality, we compute the histograms of X, Xs , C and Cs with 3 equal-sized bins, respectively. It is noted that X or Xs (three bins, each bin including two samples) and Cs (three bins, each bin including one coefficient) have the same histogram shape, but the histogram of C (only one bin including 3 coefficients) is completely different from that of X (three bins, each bin including two samples). The simple example demonstrates that mapping image into 1-D signal and then regrouping are two crucial steps to exploit the time-frequency localization characteristics of 1-D DWT, so that the invariance of the histogram shape can be transferred to the DWT domain. Note that the invariance can not be equally extended to the DFT and DCT global transform domain. Let Hμr,k denote the histogram of the image after the k-level DWT being performed. Here, μ is the bin width. Referred to Equation (1), the image hisr,0 = {hr,0 togram after regrouping may be rewritten as HM M (i) i = 1, · · · , L}, and r,0 HM = HM . Referred to Equation (5)), the low-frequency band histogram shape of 1-level DWT domain may be mathematically formulated as, r,1 Hμr,k = H(χ) 1 ·M = {0003
hr,0 hM (i) + L1 − 1 M (i) + L1 − 1 0004} = {0003 0004}, 2 2
(6)
Geometrically Invariant Image Watermarking in the DWT Domain
81
r,0 where i = 1, · · · , L, and the bin width is changed from M in HM to χ · M in r,1 Hχ·M , theoretically. If k > 1, according to the definition of k-level DWT and mathematical induction, Hμr,k may be formulated as,
Hμr,k
=
Hχr,k k ·M
= {0003
(i) + L1 − 1 hr,k−1 χk−1 ·M 2
0004} ≈ {
(i) hr,k−1 χk−1 ·M 2
}≈{
hM (i) }. 2k
(7)
Referred to Equations (5), (6) and (7), after k-level DWT the bin width in the low-frequency sub-band is χk · M instead of M in the spatial domain. Here, χ is χ=
L1 0004
g(l)
(8)
l=1
according to 1-D DWT theory. Corresponding to Equation (7), it is obvious that Hχr,k k ·M keeps the same the histogram shape as the spatial-domain histogram HM since the number of samples is almost linearly reduced. The above proof process shows the spatial-domain histogram shape invariance and its extension in the DWT domain by: i) mapping image into 1-D signal, ii) regrouping the 1-D signal and iii) afterwards performing the DWT. Due to the fact that the zigzag mapped and regrouped 1-D signal can keep part of the original spectrum of image, the watermark based on the DWT-domain histogram shape can achieve a better performance for those common image processing operations while keeping its capability against geometric attacks. 2.4
Experimental Testing
In practice, the histogram shape is not invariant exactly due to interpolation errors during geometric deformations, such as scaling, rotation and many others. In order to tolerate the interpolation errors, the spatial-domain histogram bin width should be not less than 2 according to our observations. In order to investigate the effect of interpolation during geometric attacks, we examine the histogram shape invariance by computing the relative relations of each two neighboring bins with peppers of size 512 × 512 as the example image. We also have tested the other well-known benchmark image (such as lena and baboon, etc.). The simulation results are similar. In order to satisfy the condition that the bins hold sufficient samples, we extract the histogram by referring to the mean value. For an image, we compute the histogram by the following steps: i) Remove those non-zero value pixels. By doing this step, we can avoid the effect of those resulted zero-value samples during some geometric deformations (e.g., rotation and shearing). ii) Extract the histogram by referring to the mean value and a parameter λ, ¯ (1 + λ)A]. ¯ When λ = 0.6, we have which can be formulated as B = [(1 − λ)A, achieve a satisfactory result. iii) Finally, the histogram is extracted from the low-frequency sub-band of the 2-level DWT with the wavelet base of 0003 db50003 .
82
S. Xiang and H.-J. Kim
Fig. 1. P eppers and its deformed versions with 9 typical geometric transforms
In order to evaluate the histogram shape invariance in the DWT domain, we implement 9 typical geometric attacks including scaling, rotation, shearing, bilinear, bilinear curved, warping, global bending and high-frequency bending as shown in Fig. 1(b)∼(j), as well as three different percentages of cropping. The relative relations of each two neighboring bins in the number of samples are designed to represent the histogram shape. By computing the alteration of the relative relations, we can examine the resistance of the histogram shape to various geometric attacks. The relative relation of two successive bins can be formulated as α(k) =
hM (k + 1) , 1 ≤ k ≤ L − 1, hM (k)
(9)
where L is the number of the bins, and M is the bin width, referred to Equation (8). Fig. 2 (a), (b) and (c) illustrate the relative relations of each two neighboring bins under the 9 typical geometric deformations. Fig. 2 (d) shows the effect of cropping of 5%, 15% and 25% by deleting the outer content, respectively. We can see that the relative relations in the number of pixels (or DWT coefficients) among groups of two neighboring bins are rather stable under these geometric attacks. The above experimental works show the previous theoretical analysis on the spatial-domain histogram shape insensitivity to geometric attacks and its extension in the DWT domain in Section 2 is logical. This also implies a fact that if we embed the watermark based on the relative relations, it is expected that the watermark will be able to withstand those challenging geometric attacks. In addition, since the histogram is extracted from the low-frequency DWT coefficients of the preprocessed images, the watermark also can have a good capability against those common compressing and filtering operations.
Geometrically Invariant Image Watermarking in the DWT Domain
1 0
4
Scaling of 80%
Bilinear α(k)
α(k)
2
0
5
10
15 nBINs
20
25
4 α(k)
α(k) 0
5
10
15 nBINs
20
25
10
15 nBINs
20
25
15 nBINs
20
0
0
5
10
15 nBINs
20
0
5
10
15 nBINs
20
2
α(k)
α(k) 10
15 nBINs
25
0
0
5
10
15 nBINs
20
25
0
5
10
15 nBINs
20
2
High−frequency Bending
1
10
0
5
10
15 nBINs
20
2
α(k)
α(k) 0
5
10
15 nBINs
20
25
30
(c) Global bending, High-frequency bending and Jittering
25
30
Cropped 15%
25
30
Cropped 25%
Jittering Attack
0
30
1 0
30
5
25
1 0
30
α(k)
α(k)
2
20
30
Cropped 5%
Global Bending 2
5
25
(b) Bilinear, Curved and Warping
4
0
30
5
30
(a) Rotation, Scaling and Shearing
0
25
Bilinear−Curved
Warping α(k)
α(k)
Shearing
5
10
10
2
0
5
1 0
30
4
0
0
2
o
Rotation of 20
0
2 0
30
2
83
1 0
0
5
10
15 nBINs
20
25
30
(d) Cropping of 5%,15% and 25%
Fig. 2. The effect of the geometric deformations and cropping on the histogram shape. This is evaluated by computing the relative relations of each two neighboring bins in the number of DWT coefficients, denoted by α(k). The bin relative relations of the original image P eppers are plotted with the points ‘+’ for the purpose of comparing with the attacked ones plotted with ‘o’. These results demonstrate the histogram shape invariance and the effect of the interpolation on the invariance.
3
Proposed Watermarking Algorithm
In this section, a geometrically invariant multi-bit image watermarking is presented. Watermark insertion and recovery are designed by the use of the histogram shape invariance. The histogram is extracted from the low-frequency sub-band of DWT for the purpose of improving robustness to those common image signal processing manipulations by preprocessing image (including mapping into 1-D signal and regrouping).
84
S. Xiang and H.-J. Kim
I L Original image
Preprocessing B and DWT
Extracting histogram
H
W
Embedding
Multi-bit watermark
Ȝ T Yes Watermarked Image
No
Watermark visible ?
Modify: T
IW
IDWT
Fig. 3. Block Diagram of Watermark Embedding
3.1
Watermark Insertion
In the embedding phase, the image is first mapped into 1-D style, and then regrouped (keeping the index as 0006). Furthermore, we extract the histogram as in the previous section (see Section 2.4). The k-level DWT is performed to extract the low-frequency coefficients. Divide the bins as groups, each two neighboring bins as a group is used to carry a bit of watermark information by reassigning the number of coefficients in each two bins. Finally, the inverse-DWT is performed to generate the watermarked image with the index 0006. Fig. 3 illustrates block diagram of the embedding process. Suppose that there is a binary sequence W = {wi i = 1, · · · , Lw } to be hidden into a digital image. The histogram in the low-frequency sub-band of DWT is denoted by H = {hM (i) i = 1, · · · , L}. L should be not less than 2Lw in order to embed all bits. Let Bin 1 and Bin 2 be two consecutive bins, which include a and b coefficients, respectively. We control the relative relation of the two bins to embed one bit of information, formulated as 0003 a/b ≥ T if w(i) = 1 (10) b/a ≥ T if w(i) = 0 where T is a threshold selected with the consideration of the watermark robustness performance and the embedding distortion. How to embed one bit into two neighboring bins is depicted in Fig. 4. Let us consider the case that w(i) is 0003 10003 . If a/b ≥ T , no operation is needed. Otherwise, the number of samples in two bins, a and b, will be adjusted until satisfying a1 /b1 ≥ T for the insertion of the bit 0003 10003 . In the case of embedding the bit 0003 00003 , the procedure is similar. The watermark rules are referred to Equations (11) and (12). If w(i) is 0003 10003 and a/b < T , some randomly selected samples from Bin 2, in the number denoted by I2 , will be modified to fall into Bin 1, achieving a1 /b1 ≥ T .
Geometrically Invariant Image Watermarking in the DWT Domain
85
Fig. 4. Illustration of embedding one bit of watermark information. There are four cases in total: (a) a > b and a/b < T , (b) a > b and a/b ≥ T , (c) b > a and b/a < T , and (d) b < a and b/a ≥ T . I1 and I0 are the numbers of the least modified coefficients according to the watermarking embedding rule.
If w(i) is 0003 00003 and b/a < T , some selected coefficients from Bin 1, in the number denoted by I1 , are adjusted into Bin 2 , satisfying b0 /a0 ≥ T . The rule for reassigning the coefficients is described as Equation (11). 0003 f10003 (i) = f1 (i) + M, 1 ≤ i ≤ I1 (11) f20003 (j) = f2 (j) − M, 1 ≤ j ≤ I2 where M is the bin width. f1 (i) and f2 (j) denote the ith and j th modified coefficient in Bin 1 and Bin 2, respectively. The modified coefficients f10003 (i) fall into Bin 2 while f20003 (i) move to Bin 1. I1 and I2 can be computed by using the following mathematical expressions, 0003 I1 = (T ∗ b − a)/(1 + T ) making a1 /b1 ≥ T f rom a/b ≤ T (12) I2 = (T ∗ a − b)/(1 + T ) making b0 /a0 ≥ T f rom b/a ≤ T, where a1 = a + I1 , b1 = b − I1 , a0 = a − I2 , and b0 = b + I2 . This procedure is repeated until all watermark bits are embedded. Finally, the inverse-DWT is implemented to generate the watermarked image, denoted by Fw (x, y) = {fw (x, y) x = 1 ∼ R, y = 1 ∼ C}. 3.2
Watermark Recovery
Our goal is to get an estimate of hidden bits, W 0003 = {w0003 (i) i = 1, · · · , Lw }, at a low error rate. The histogram extracted from the low-frequency coefficients in Fw0003 (x, y) is extracted as in the process of watermark insertion. Suppose that the
86
S. Xiang and H.-J. Kim
number of coefficients in two consecutive bins are a0003 and b0003 . By comparing their values, we will be able to extract the hidden bit, formulated as 0003 1 if a0003 /b0003 ≥ 1 0003 w (i) = (13) 0 otherwise. The same process is repeated to obtain all the hidden bits by computing all groups of two neighboring bins. In the watermarking decoder, the parameters, Lw , λ, are beforehand known. Thus, this is a blind watermarking algorithm.
4
Experimental Results
Below, we conduct experiments to demonstrate the performance of the proposed multi-bit watermarking scheme. The watermark is embedded in the lowfrequency subband of 2-level DWT. The wavelet base 0003 db50003 is adopted. In the experiments, a 10-bit watermark was embedded into a set of 512 × 512 example images (five of them in total, including Boat, Peppers, Baboon and Lena and Clock ) using the proposed algorithm. The parameter λ = 0.6 is selected to compute the histogram with 20 bins. The embedding threshold, T , is 1.5. We assign 1 = 2 = 6% for the reliable watermark recovery. 4.1
Imperceptibility
In the embedding, only a small percent of pixels are modified with a smaller distortion magnitude (less than 6 gray levels). Since the probability of those selected samples being added or reduced is approximately equal, the watermark hardly has an effect on the image mean. The PSNR and mean values are tabulated in Table 1. Aˆo and Aˆw denote the mean before and after watermarking, respectively. For the five example images, the PSNR is over 40 dB. Usually, we consider that the watermark being embedded in the low-frequency sub-band will be able to cause the visual distortion. In order to better evaluate the watermark distortion, we plot the example images Lena and P eppers and their watermarked versions, as shown in Fig. 5. We can see that though the watermark is embedded into the low-frequency component, the distortion is imperceptible. The basic reason is that the distortion on the low-frequency coefficients has been evenly spread over times of pixels. 4.2
Robustness
Table 2 shows a list of attacks used to distort the images and their effects on the watermark in the experiments. In case of Stirmark attacks, the watermark can resist the cropping of 50%, JPEG compression of the quality factor 60, and all attacks regarding Affine Transform, Rotation, Rotation+Scale, Rotation+Cropping, and Median filtering.
Geometrically Invariant Image Watermarking in the DWT Domain
87
Table 1. The distortions caused by the watermark in the spatial (denoted by 0002 S 0002 ) and DWT Domain (denoted by 0002 D0002 ) (In PSNR and Mean) Features ˆo ) Mean (A ˆw ) Mean (A PSNR (dB)
Boat (S/D) 117.53/235.06 117.51/235.03 44.64/44.12
Peppers (S/D) Baboon (S/D) 119.83/239.65 98.84/197.68 119.63/239.26 98.65/197.37 45.02/44.95 46.40/46.40
Clock (S/D) 70.48/140.88 70.58/140.95 50.23/50.17
Lena (S/D) 115.14/230.29 115.07/230.15 44.99/44.98
Table 2. Watermarking robustness to geometric attacks and common signal processing operations in the low-frequency subbabd of the DWT domain Strimarm4.0 AFFINE (all) CROP (50,75) ROT (all) ROTSCALE (all) ROTCROP (all) MEDIAN (all) RML (all) JPEG (15-50) JPEG (60-100)
BER 0 0 0 0 0 0 0 0.1 0
Checkmark Linear Transform Aspect Ratio Projective Warping Shearing Row/Column Removal Down/Up Sampling JPEG2000 (0.5 ∼ 8.0 bpp) JPEG2000 (0.2 ∼ 0.4 bpp)
BER 0 0 0 0 0 0 0 0 0.1
The Checkmark benchmark is one of the second generation watermarking evaluation tools, which has incorporated some common but challenging geometric deformation operations, such as RBAs based on projective transform and image warping. Therefore, here we adopt the Checkmark for measuring the watermark performance to those challenging geometric attacks. In Table 2, the attack parameters are respectively given as: Linear Transform (T11 = −0.85; T12 = −0.2; T21 = −0.05; T22 = 1.3), Aspect Ratio (xscale=0.8 yscale=1.1), projective (30 degree rotation along x-axis + perspective projection), Warping (warpfactor =12), Shearing (xshear=5% yshear=5%), and Row/Column removal (12 rows and 9 columns removed) are implemented, respectively. In addition, the watermark is also robust to Down/Up Sampling and JPEG2000 (with 0.5 bpp, compression rate of 16:1). We can see that the watermark also has a satisfactory robustness to those challenging geometric attacks. Overall, in both the Stirmark and the Checkmark tests we show that in the low-frequency sub-band of 2-level DWT, the watermark has a satisfactory robustness to geometric attacks (even those challenging geometric attacks, such as cropping and RBAs) and common image processing operations (e.g., median filter, JPEG and JPEG2000 compressions). We believe that it is due to two main aspects: (i) The histogram-based watermark is insensitive to the shifting of the pixels on the image plane since the histogram is independent of the pixel position. As a result, the watermark is insensitive to those challenging geometric attacks and (ii) in the low-frequency sub-band of DWT, the watermark has a potential capability against those noise-like image processing operations.
88
S. Xiang and H.-J. Kim
Fig. 5. The original example images and the marked ones (resolution of 44%): (a) Lena (F ), (b) Peppers, (c) Lena after watermarking, (d) Peppers after watermarking
5
Concluding Remarks
In both the theoretical analysis and the extensive experiments, in this paper we show that i) the histogram shape is insensitive to various geometric deformations, and ii) the histogram shape invariance property can be extended to the lowfrequency sub-band of the DWT domain. Accordingly, a geometrically invariant image watermarking scheme is successfully designed by using the histogram shape invariance in the DWT domain. Extensive experimental works have shown that the watermark has a satisfactory robustness performance for those challenging geometric attacks (e.g., projective transform, image warping and random
Geometrically Invariant Image Watermarking in the DWT Domain
89
cropping). The watermark also achieves good robustness against those common image processing operations, such as JPEG compression, JPEG2000 compression, median filters, etc. In theory, the watermark’s embedding capacity is at most 128 bits. In practice, the watermark can achieve a satisfactory robustness only when the embedding capacity is less than 20 bits according to our observations. The more the embedding capacity, the lower the watermark robustness. Thus, how to improve data embedding capacity is a consideration of future works. In addition, our proposed watermarking scheme is suffering from its performance limitation for the image of smaller size since a smaller image can not provide efficient samples to satisfy the condition that the bins hold enough samples. Another note is that the proposed watermarking algorithm is not working well for those high-contrast images, which usually have a distinctive histogram. Since the algorithm is very robust to geometric deformations under the constraint of visual coherence, we consider that our proposed watermarking scheme can combine with other robust watermarking algorithms together for better resisting those challenging geometric attacks and common image signal processing operations.
Acknowledgement This work was supported by the Second Brain Korea 21 Project, and the MIC (Ministry of Information and Communication), Korea, under the ITRC (Information Technology Research Center) support program supervised by the IITA (Institute of Information Technology Advancement)” (IITA-2006-(C1090-06030025)).
References 1. Cox, I.J., Kilian, J., Leighton, T., Shamoon, T.: Secure Spread Spectrum Watermarking for Multimedia. IEEE Transaction on Image Proccess 6(6), 1673–1687 (1997) 2. Johnson, N., Duric, Z., Jajodia, S.: Recovery of Watermarks from Distorted Images. In: Pfitzmann, A. (ed.) IH 1999. LNCS, vol. 1768, pp. 318–332. Springer, Heidelberg (2000) 3. Davoine, F.: Triangular Meshes: A Solution to Resist to Geometric Distortions Based Watermark-Removal Softwares. In: Proc. Eurasip. Signal Processing Conference, vol. 3, pp. 493–496 (2000) 4. Ruanaidh, J., Pun, T.: Rotation, Scale and Translation Invariant Spread Spectrum Digital Image Watermarking. Signal Processing 66(3), 303–317 (1998) 5. Lin, C.Y., Wu, M., Bloom, J., Miller, M., Cox, I., Lui, Y.M.: Rotation, Scale, and Translation Resilient Public Watermarking for Images. IEEE Transaction on Image Processing 10(5), 767–782 (2001) 6. Zheng, D., Zhao, J., Saddik, A.: RST-Invariant Digital Image Watermarking Based on Log-Polar Mapping And Phase Correlation. IEEE Transactions on Circuits and Systems for Video Technology 13(8), 753–765 (2003)
90
S. Xiang and H.-J. Kim
7. Pereira, S., Pun, T.: Robust Template Matching for Affine Resistant Image Watermarks. IEEE Transaction on Image Processing 9(6), 1123–1129 (2000) 8. Kang, X., Huang, J., Shi, Y., Lin, Y.: A DWT-DFT Composite Watermarking scheme Robust to Both Affine Transform and JPEG Compression. IEEE Transactions on Circuits and Systems for Video Technology 13(8), 776–786 (2003) 9. Herrigel, A., Voloshynovskiy, S., Rytsar, Y.: The Watermark Template Attack. In: Proc. SPIE Electronic Imaging (2001) 10. Kim, H.S., Lee, H.K.: Invariant Image Watermark Using Zernike Moments. IEEE Transaction on Circuits and Systems for Video Technology 13(8), 766–775 (2003) 11. Alghoniemy, M., Tewfik, A.H.: Geometric Invariance in Image Watermarking. IEEE Transactions on Image Processing 13(2), 145–153 (2004) 12. Dong, P., Brankov, J.G., Galatsanos, N.P., Yang, Y.Y., Davoine, F.: Affine Transformation Resistant Watermarking Based on Image Normalizaton. IEEE Transactions on Image Processing 14(12), 2140–2150 (2005) 13. Lu, C.S., Sun, S.W., Hsu, C.Y., Chang, P.C.: Media Hash-Dependent Image Watermarking Resilient Against Both Geometric Attacks and Estimation Attacks Based on False Positive-Oriented Detection. IEEE Transactions on Multimedia 8(4), 668– 685 (2006) 14. Seo, J.S., Yoo, C.D.: Image Watermarking Based on Invariant Regions of ScaleSpace Representation. IEEE Transactions on Sginal Processing 54(4), 1537–1549 (2006) 15. http://www.cl.cam.ac.uk/∼ fapp2/watermarking/stirmark/ 16. http://watermarking.unige.ch/Checkmark/ 17. Yeo, I.K., Kim, H.J.: Generalized Patchwork Algorithm for Image Watermarking. Multimedia System 9(3), 261–265 (2003) 18. Coltuc, D., Bolon, P.: Watermarking by Histogram Specification. In: Proc. SPIE International Conference on Security and Watermarking of Multimedia Contents II, pp. 252–263 (1999) 19. Coltuc, D., Bolon, P., Chassery, J.M.: Fragile and Robust Watermarking by Histogram Specification. In: Proc. SPIE International Conference Security and Watermarking of Multimedia Contents IV, vol. 4675, pp. 701–710 (2002) 20. Lee, S., Suh, Y., Ho, Y.: Lossless Data Hiding Based on Histogram Modification of Difference Images. In: Aizawa, K., Nakamura, Y., Satoh, S. (eds.) PCM 2004. LNCS, vol. 3333, pp. 340–347. Springer, Heidelberg (2004) 21. Licks, V., Jordan, R.: Geometric Attacks on Image Watermarking Systems. IEEE Multimedia 12(3), 68–78 (2005) 22. Xiang, S., Huang, J., Yang, R.: Time-scale Invariant Audio Watermarking Based on the Statistical Features in Time Domain. In: Proc. of Information Hiding workshop (June 2006) 23. Xiang, S., Kim, H.J.: An Invariant Image Watermarking Against Geometric Attacks. In: Proc. International Joint Workshop on Information Security and Its Applications (February 2007)
Implementation of LSM-Based RBAC Module for Embedded System Jae-Deok Lim1 , Sung-Kyong Un1 , Jeong-Nyeo Kim1 , and ChoelHoon Lee2 1
2
Electronics and Telecommunications Research Institute (ETRI) 161 Gajung-dong, Yuseong-gu, Daejeon 305-700, Korea {jdscol92,skun,jnkim}@etri.re.kr Computer Science & Engineering, Chungnam National University 220 Gung-dong, Yuseong-gu, Daejeon 305-764, Korea [email protected]
Abstract. Security requirements of the embedded system which were not considered when the embedded system is independently deployed are being increased because the embedded system is connected to an internet. Accordingly, the coverage of the system security is being expanded from the general server to the embedded system. And it is not enough that the embedded system supports only its inherent functions and it becomes the essential element to provide the security function to the embedded system. This paper implements the Role Based Access Control(RBAC) module which is designed using the Linux Security Module(LSM) for the embedded system. RBAC allows security management to be administrated easily and LSM is a lightweight, general purpose, access control framework for mainstream Linux kernel that enables many different access control models. The combination of RABC and LSM properties is very suitable for one of security solutions of embedded system because of the simplicity and flexibility of RBAC and a lightweight loadable mechanism of LSM. And we show the performance of our implementation that has very small overhead for the intended processing and is acceptable.
1
Introduction
The embedded system market is gradually enlarged to the request increment of the terminal performing the specific purpose. Particularly, the network-based service is generalized around the mobile personal digital assistants and the demand about the network service is more increased as the personal digital assistants have the high spec and high performance including the Smart phone, Ultra mobile PC(UMPC), Portable Multimedia Player(PMP), and etc. The usage of the internet service heightens the possibility of the external invasion and the sensitive personal data including the bank account, user certificates, personal address book, record of utilization, and etc. can be taken out stored in small and medium embedded system such as the personal digital assistant or the settop box, and etc. The embedded system using the internet service needs the security requirement which was not considered when being employed excluding network function. S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 91–101, 2007. c Springer-Verlag Berlin Heidelberg 2007 0002
92
J.-D. Lim et al.
In fact the Fsecure Mobile Anti-virus site reports that the total number of mobile malware rose up to 344 by the end of December 2006[1]. Particularly F-Secure’s research lab received samples of new mobile spying tools, running on Windows Mobile and Symbian S60 3rd Edition devices in May, 2007. According to the classification of a malware, the Trojan horse type occupied 76.6% to a most and the worm type occupied about 15.8%. Some of actually discovered viruses from [1] are as follows. Doomboot.A disguises as the crack file of the doom2 which is the game for the cellular phone and is distributed widely. The phone operation is interrupted in an infection and it has the information stored in the terminal and the concern losing data in a recovery. P bstealer.A disguises as the useful utility and infects cellular phone. And it leaks the phone book data information of the corresponding terminal. Sendtool.A installs some of Trojan horse codes pretending to the useful program. And it is propagated through the bluetooth. To correspond to the threats as in the above, it is needed that the security function for protecting data and operating environment as well as the original function of embedded system, especially access control function. This paper tries to strengthen the security of the embedded system by applying the Role Based Access Control(RBAC) function to the embedded linux to be used as OS of the embedded system and the role-based access control is implemented based on the Linux Security Module(LSM) framework and then able to be easily applied to the embedded system. In this paper, it is called L-RBAC that the implementation of role-based access control model using LSM framework. RBAC is suitable to the access control model for an embedded system because of being the model which is more advantageous than the other access control model such as Discretionary Access Control (DAC) and Mandatory Access Control(MAC) as to the management and use[2][3]. DAC is an access policy that restricts access to files(and other system objects such as directories, devices, and IPC) based on the identity of users and/or the groups to which they belong. DAC is commonly implemented by Access Control List(ACL) and also is suitable for small system which has little users and resources. But DAC has the security hole that its policy can be modified and ignored without any restriction by user who has the super user right, that is ’root’ user in linux system. Most malwares can acquire the super user right with various ways for example buffer-over-flow. RBAC provides the role based and strengthened access control function and is one of the access control models which is applied in order to intensify the security of the system. In RBAC, access to object by a subject is determined by role not the identity of subject. And it is the access control model providing the convenience of the management and many flexibilities unlike DAC, MAC, and etc. to the policy manager. A role is defined as the job function about the various tasks. Users are assigned to roles, permissions are assigned to roles and users acquire permissions by being members of roles. The LSM framework meets the goal of enabling many different security models with the same base Linux kernel while minimally impacting the Linux kernel[4]. The generality of LSM permits enhanced access controls to be effectively
Implementation of LSM-Based RBAC Module for Embedded System
93
implemented without requiring kernel patches. LSM also permits the existing security functionality of POSIX.1e capabilities to be cleanly separated from the base kernel. This allows users with specialized needs, such as embedded system developers, to reduce security features to a minimum for performance. So the combination of RABC and LSM properties is very suitable for one of security solutions of embedded system. Of cause LSM-based RBAC can be also applied to the general linux system such as servers and desktops. This paper shows that the LSM-based RBAC model is suitable for the embedded system and can be one solution for securing the embedded system. The LSM is a lightweight, general purpose, access control framework for the mainstream Linux kernel that enables many different access control models to be implemented as loadable kernel modules. And a number of existing enhanced access control implementations have already been adapted to use the LSM framework, for example, POSIX.1e capabilities[5], SELinux[6], and Domain and Type Enforcement (DTE)[7]. Although SELinux supports RBAC, it was mainly designed for a desktop or a server. And it is not suitable for the embedded system because it has the difficulty of an operation and setting up and is the use of the concept complicated with the Type Enforcement, the MultiLevel Security, and etc. The rest of this paper is organized as follows. A brief description of standard of RBAC is contained in section 2. Section 3 describes the design and implementation of L-RBAC module. Section 4 describes the performance overhead of L-RBAC. Section 5 shows the example of L-RBAC application. Finally conclusions and future work are discussed in section 6.
2
Standard of Role-Based Access Control
The standard for RBAC is proposed by NIST to have unified ideas from prior RBAC models, commercial products, and research prototypes[8]. It is intended to serve as a foundation for product development, evaluation, and procurement specification. The RBAC reference model from [8] is defined in terms of four model components – Core RBAC, Hierarchical RBAC, Static Separation of Duty Relations, and Dynamic Separation of Duty Relations. Core RBAC defines a minimum collection of RBAC elements, element sets, and relations in order to completely achieve a Role-Based Access Control system. This includes user-role assignment and permission-role assignment relations, considered fundamental in any RBAC system. In addition, Core RBAC introduces the concept of role activation as part of a user’s session within a computer system. Core RBAC is required in any RBAC system, but the other components are independent of each other and may be implemented separately. The Hierarchical RBAC component adds relations for supporting role hierarchies. A hierarchy is mathematically a partial order defining a seniority relation between roles, whereby senior roles acquire the permissions of their juniors and junior roles acquire users of their seniors.
94
J.-D. Lim et al.
Separations of duty relations are used to enforce conflict of interest policies. Conflict of interest in a role-based access control system may arise as a result of a user gaining authorization for permissions associated with conflicting roles. Static Separation of Duty(SSD) Relations adds exclusivity relations among roles with respect to user assignments. A common example is that of Static Separation of Duty (SSD) that defines mutually disjoint user assignments with respect to sets of roles. Dynamic Separation of Duty(DSD) Relations defines exclusivity relations with respect to roles that are activated as part of a user’s session. DSD properties provide extended support for the principle of least privilege in that each user has different levels of permission at different times, depending on the role being performed. In addition to [8], there is RBAC implementation standard of draft version 0.1[9]. The draft describes the requirement of RBAC implementation and requires that Administrative Commands and Supporting System Functions for Core RBAC must be provided at least. In this paper, we comply with NIST RBAC standard and requirements[8][9]; that is, core RBAC component and separation of duty component are provided.
3
Design and Implementation of L-RBAC
L-RBAC was designed simply for the embedded system and designed in order to be satisfied the RBAC standard features[8]. Table 1 shows the function of a standard requesting and the function provided by L-RBAC. STD-6.2 property, a hierarchy, is more suitable to the large server system which is used by multi-user and multi-tasking. But it may not be suitable to the embedded system which is mostly used by the single user. Therefore, the STD6.2 property is excluded in our design. The core component, STD-6.1 property, must be provided in any RBAC and applied to our design though the commands shown at table 1 and the libraries corresponding to each command. Figure 1 shows the access policy between subject and object with role. The assumption Table 1. The L-RBAC functions according to the Standard of RBAC Command of L-RBAC
Standard number
System role : addrole, delrole, getallrole STD-6.1 (Core RBAC) User role : seturole, geturole
STD-6.1 (Core RBAC)
Process role : setprole, getprole
STD-6.1 (Core RBAC)
Object role : setfrole, getfrole
STD-6.1 (Core RBAC)
setssd, getssd
STD-6.3.1 (SSD relations)
setdsd, getdsd
STD-6.3.2 (DSD relations)
setcrole, getcrole
Cardinality (additional supported)
Implementation of LSM-Based RBAC Module for Embedded System
rwx
Si
95
Oj_any
(a) Case that RBAC is not applied
Si SRi
rwx rw x
ORj_any
SRi
ORj_rw
SRi
r wx rwx
ORj_r ORj_rwx
(b) Case that RBAC is applied
Fig. 1. Access model between subject and object. (a) Traditional situation. Subject S can do any type of access to object O if S has the ownership of O. (b) Secure situation. Subject S can do an only accepted type of access to object O if S has the permission of role assigned to O. In this case, S cannot access to O if S does not have the permission of role assigned to O even though S has the ownership to O
of Fig.1 is that S has the ownership to O, that is, a subject S can do any type of access to object O if O does not be set any role. The meaning of symbols and assumption is as follows. - S i : a subject i - S Ri : a subject i which has a permission of role R - Oj mode : an object j with mode permission. mode indicate r for read, w for write, x for execute and any for any permission type. - ORj mode : an object j assigned role R with mode permission but not assigned any role. mode is the same above. STD-6.3.1 and STD-6.3.2 show Static Separation of Duty (SSD) relations and Dynamic Separation of Duties(DSD) relations respectively. For example, if role1 and role2 were set up in SSD, they cannot be assigned to a user or an object simultaneously. And if role1 and role2 are set up in DSD, they cannot be activated at the same time but can be assigned to a user or an object. In our design, L-RBAC also provides the Cardinality that is the limitation of assignment of role. If role1 is set up in Cardinality with 3, the role1 cannot be assigned 4 times at the same time. This property can be used to prevent from abusing of role. Now, we describe the implementation of L-RBAC with LSM properties. Figure 2 shows the architecture of L-RBAC. The process of enforcement is started from the request of LSM hooks. LSM allows module to mediate access to kernel objects by placing hooks in the kernel code just ahead of the access, as shown in left side of Fig.2[4]. Just before the kernel would have accessed an internal object, a hook makes a call to a function that the L-RBAC module
96
J.-D. Lim et al.
User command Request
Result Security DB
LIB (Library) Request User Level Process
User Space
Result Read/ Write
/proc driver (interface)
Kernel Space syscall
Request
Error
Deny
get/set
DAC check OK LSM hook OK
L-RBAC module
syscall
Error check OK Deny
Result
Result
Request
DB mgr
Allow/Deny Result
get/set
Execute code Enforcer Return
Fig. 2. The architecture of L-RBAC. There are two pathes for enforcement and management. Enforcement is requested from LSM hooks and Management is requested from /proc device driver.
must provide. The L-RBAC module can either let the access granted or denied forcing an error code return. LSM is the security framework of the kernel level provided from the linux kernel 2.6 and more. And LSM has total 88 hooks and calls the function provided by the LSM module in the place where it is necessary to have the security check including the program loading, the file access, the IPC use, and etc. L-RBAC module consists of enforcer for determining access, DB manager for searching and setting security DB, system call and /proc device driver interface for managing RBAC. As shown Fig.2, RBAC management is achieved through /proc device driver. Each management command such as addrole, delrole and so on calls a correspondent system call and system call request a correspondent action to DB manager. DB manager modify or update the security DB and response to requester. Security DB stores the information that is used to determine if an access is granted or denied, in-cluding inode of object, role assigned to object, the device in which object is. To speed up to search DB, it is constructed by 4 hash list and is loaded into kernel mem-ory when L-RBAC is initialized.
Implementation of LSM-Based RBAC Module for Embedded System
4
97
Performance Overhead
The performance overhead of L-RBAC module is explained in this section. The overhead imposed by such a model is a composite of the LSM framework overhead and the actual RBAC enforcement overhead. We had two tests, macrobenchmark on PC and microbenchmark on target system. At First we use the widely used kernel compilation benchmark for macrobenchmark, measuring the time to build the Linux Kernel. We ran this test on a Pentium IV 3.00Ghz processor with 1 GB RAM. The conditions of this test are that there is no network activity, caches are flushed(that is rebooted), compiled with pre-configured ‘.config’ file and measurement values are acquired with time command. Table 2 shows the macrobenchmark test result. As shown at Table 2, the L-RBAC has 1.45% overhead at system space, 0.1% overhead at user space and just 0.24% overhead totally. We think that it is acceptable compared to the overhead of LSM only, 0–0.3%[4]. Table 2. The result of kernel compilation macrobenchmark Space
Without L-RBAC(time) With L-RBAC(time) Overhead
Real(user+sys)
31m 41.067s
31m 45.691s
0.24%
User
28m 15.110s
28m 16.774s
0.1%
Sys
1m 58.371s
2m 0.092s
1.45%
Table 3. The result of kernel compilation macrobenchmark Syscall
Without L-RBAC(time) With L-RBAC(time) Overhead
Open/close
11.8310ms
12.6815ms
7.2%
Stat
7.4214ms
8.0136ms
8.0%
Fork/exit
3,263.5000ms
3,263.5000ms
0%
Fork/exec
3,442.0000ms
3,438.5000ms
-0.1%
Secondly we used LMBench for microbenchmarking[10]. LMBench was developed specifically to measure the performance of core kernel system calls and facilities, such as file access, context switching, and memory access. We ran this test on a X-Hyper270A embedded system which is based on Intel PXA270 processor. It has the performance of the low power and high-end (520MHz) and then can be used for the mobile related products group. In addition, it also supports the Linux 2.6.11 and can use the LSM security framework. In target system, the macrobenmark (Kernel Soruce Compilation) is unable to be used due to the resources restriction of the Target System like the PC Version. Table 3 shows
98
J.-D. Lim et al.
the result of a microbenchmark test and the values are the mean value at each test after 100 time repetition. In [4], the open/close overhead of LSM only is 2.7% and that of L-RBAC is 7.2%. But the test of [4] was on high-end level system and just for LSM module without RBAC function and the test of L-RBAC was on low-end embedded system and for LSM module with RBAC function. So the overhead of L-RBAC is very acceptable.
5
Example of Application
In this section, the example in which the RBAC model is applied to the personal digital assistant such as smart phone which is the representative kind of the embedded system is illustrated. The current mobile smart phone provides various internet-based services and then stores personal data such as the personal identification number and authentication information etc. for the internet-based services. Figure 3(a) shows the process of the personal information being taken out from the smart phone by the Trojan horse when using the internet service. Hacker inserts the malicious code into the popular programs like games and uploads the modified programs. The inserted malicious code can have the activities like the Trojan horse having the personal information outflow function. And then, user downloads the modified program into the smart phone and executes it. At this time, regardless of the original program operation, the malicious code accesses the personal information and delivers it to the hacker not knowing to the owner of smart phone. This example that is shown at Fig.3(a) can be enough issued with the malware and/or malicious software which is illustrated in section 1. Figure 3(b) shows the process of protecting the personal information from the attack of the method as in the above by applying RBAC. A role is assigned to personal data like Fig.3(b) in advance for access control with RBAC by using one of L-RBAC commands introduced at Table 1, setf role which can assign a permission to an object with a role. Now the only subject that has an assigned role to object can access the object. A malicious code tries to access the personal data through the process like Fig.3(a). The access request is captured by LSM hook in kernel and then the request is passed to the Enforcer of L-RBAC with the subject ID and object ID. Enforcer requests the access decision to the DB manager. DB manager determines if the access is granted or denied through the security DB. At this time, the object assigned the permission of a role and subject hoes not has a role, so DB manager returns that this access should be denied. Like this process, L-RBAC module determines the access based on a role and permission assigned to the data and a role assigned to the malicious code that tries to access the data. In this case, the malicious code does not have a role that is able to access to data because it is the abnormal program. So the malicious code cannot access the personalization data protected with a role. The example in this section is very simple model in which RBAC is applied, but the effectiveness is very good.
Implementation of LSM-Based RBAC Module for Embedded System
99
Internet Download the infected contents(game etc.) with malware
Web server Upload the infected contents(game etc.) with malware
Address list, Certification, Account, …
execution
Hacker’s host (a) Service without L-RBAC
Internet Download the infected contents(game etc.) with malware
Web server Upload the infected contents(game etc.) with malware
execution
Address list, Certification, Account, X … ROLE Hacker’s host
Setfrole –a RW {objects} ROLE (b) Service with L-RBAC
Fig. 3. The example of L-RBAC application. In (a), personal data can be taken out by the Trajan horse. In (b), personal data can be protected by role of L-RBAC policy. The data had been set to be only accessed by the subject with role of ROLE. But malware code cannot access the data because it has been executed abnormally and then cannot have any role.
6
Conclusion
The embedded system is the particular system for the specific purpose. At the past, most embedded system was the stand-alone type and focused on performing function of the system itself. Now the network-based service is generalized
100
J.-D. Lim et al.
around the mobile personal digital assistants and the demand about the network service has been more increased. The usage of the internet service heightens the possibility of the external invasion and the sensitive personal data including the financial information, user certificates, personal address book, record of utilization, and etc. can be taken out stored in small and medium embedded system. In fact there are reported the presence of malicious code for embedded system especially mobile phones. Therefore, the system resources and the user data are needed to be safely protected from these crackings even though it is embedded system. This paper tries to strengthen the security of the embedded system by applying the Role Based Access Control(RBAC) function to the embedded linux to be used as OS of the embedded system and the role-based access control is implemented based on the Linux Security Module(LSM) framework and then able to be easily applied to the embedded system. The combination of RABC and LSM properties is very suitable for one of security solutions of embedded system because of the simplicity and flexibility of RBAC and a lightweight loadable mechanism of LSM. Although we introduced the simple example of application of RBAC for mobile smart phone, there is few practical application of RBAC for embedded system including home network devices, car computing system and so on as well as mobile assistants. So it should be studied that the specified application model of RBAC which is suitable for the various embedded computing environments such as home networking, car computing, sensor networking and so on.
References 1. F-Secure Mobile Anti-virus, http://mobile.f-secure.com/ 2. Sandhu, R.S., Coyne, E.J., Feinstein, H.L., Youman, C.E.: Role-based Access Control models. IEEE Computer 29(2), 38–47 (1996) 3. Ferraiolo, D., Cugini, J., Kuhn, D.R.: Role-based Access Control: Features and motivations. In: Annual Computer Security Applications Conference, IEEE Computer Society Press, Los Alamitos (1995) 4. Wright, C., Cowan, C., Smalley, S., Morris, J., Kroah-Hartman, G.: Linux Security Modules: General Security Support for the Linux Kernel. In: Proceedings of the 11th USENIX Security Symposium, pp. 17–31 (2002) 5. Trumper, W.: Summary about POSIX.1e (1999), http://wt.xpilot.org/publications/posix.1e 6. Spencer, R., Smalley, S., Loscocco, P., Hibler, M., Andersen, D., Lepreau, J.: The Flask Security Architecture: System Support for Diverse Security Policies. In: Proceedings of the Eighth USENIX Security Symposium, pp. 123–139 (1999) 7. Hallyn, S., Kearns, P.: Domain and Type Enforcement for Linux. In: Proceedings of the 4th Annual Linux Showcase and Conference (2000) 8. American National Standard for Information Technology – Role Based Access Control. ANSI/INCITS 359-2004 (2004)
Implementation of LSM-Based RBAC Module for Embedded System
101
9. Role Based Access Control Implementation Standard Version 0.1. draft-rbacimplementation-std-v01 (2006), http://csrc.nist.gov/rbac 10. McVoy, L.W., Staelin, C.: lmbench: Portable Tools for Performance Analysis. In: USENIX Annual Technical Conference (1996), http://www.bitmover.com/lmbench/
Iteration Bound Analysis and Throughput Optimum Architecture of SHA-256 (384,512) for Hardware Implementations Yong Ki Lee1 , Herwin Chan1 , and Ingrid Verbauwhede1,2 1
University of California, Los Angeles, USA Katholieke Universiteit Leuven, Belgium {jfirst,herwin,ingrid}@ee.ucla.edu
2
Abstract. The hash algorithm forms the basis of many popular cryptographic protocols and it is therefore important to find throughput optimal implementations. Though there have been numerous published papers proposing high throughput architectures, none of them have claimed to be optimal. In this paper, we perform iteration bound analysis on the SHA2 family of hash algorithms. Using this technique, we are able to both calculate the theoretical maximum throughput and determine the architecture that achieves this throughput. In addition to providing the throughput optimal architecture for SHA2, the techniques presented can also be used to analyze and design optimal architectures for some other iterative hash algorithms. Keywords: SHA-256 (384,512), Iteration Bound Analysis, Throughput Optimum Architecture.
1
Introduction
Hash algorithms produce a fixed size code independent of input size of messages. Generated codes from hash algorithms are commonly used for digital signature [1] and message authentication. Since the hash outputs are relatively small compared to the original messages, the hash algorithms take an important roll for computation efficiency. Considering the increasing data amount to store or communicate, the throughput of hash algorithms is an important factor. Common hash algorithms include SHA1, MD5, SHA2 family (SHA256, SHA384 and SHA512) and RMD family (RMD160, RMD256 and RMD320). The SHA2 family of hash algorithms [2] has become of particular interest lately due the official support of the National Institute of Standards and Technology (NIST) in 2002. Even though many publications were produced to show high throughput architectures of SHA2, there has been no mathematical analysis of the delay bound. In this paper, we analyze the iteration bound analysis of SHA2, which gives us the maximum theoretical throughput achievable by the algorithm. Knowing the iteration bound not only allows the designer a goal to design towards, but also signals the designer when an optimal architecture has been achieved. S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 102–114, 2007. c Springer-Verlag Berlin Heidelberg 2007 0002
Iteration Bound Analysis and Throughput Optimum Architecture
103
The remainder of this paper is organized as follows. We start with reviewing related work in Section 2 and introduce the iteration bound analysis and transformations in Section 3. In Section 4, we analyze the iteration bound of the SHA2 family of hash algorithms and show the procedure to design throughput optimum architectures. Some comments for implementation and synthesis results are given in Section 5 followed by the conclusion in Section 6.
2
Related Works
The most common techniques for high throughput hardware implementations of hash algorithms are pipelining, unrolling and using Carry Save Adder (CSA). Pipelining techniques are used in [4,5,6,7], unrolling techniques are used in [7,8,9,10], and CSA techniques are used in [4,5,6,7,10,11]. Some other interesting implementations can be found in [12,13]. Even though there are many published papers, the mathematical analysis of the iteration bound has rarely been performed. For example, even though the iteration bound for SHA2 was achieved in [4], no analysis or proof of optimality was given. Since there is no proof of theoretical optimality and systematic approach to achieve the optimality, many architectures seem to count on the designers’ intuition. Actually the work of [4] achieved the theoretical optimum after another work [5]. Therefore, this work will not only prevent a futile attempt to design architecture achieving better throughput than the theoretical optimum but also will guide designers to achieve the theoretical optimum throughput in MD4-based hash algorithms.
3
The Iteration Bound Analysis and Transformations
The SHA2 family of hash algorithms are iterative algorithms, which means the output of one iteration is the input of the next. We use a Data Flow Graph (DFG) to represent dependencies. We continuously apply the techniques of retiming and unfolding to achieve the iteration bound. Even though the optimized SHA2 family of hash algorithms requires only the retiming transformations, both transformations are briefly discussed for a better understanding of the analysis technique. Some of the MD4-based hash algorithms may require the unfolding transformation. A more detailed discussion of the iteration bound and the transformations can be found in [3]. 3.1
DFG Representation
The mathematical expression of our example is given in Eq. 1. A and B are variables which are stored in registers and the indices of the variables represent the number of iterations of the algorithm. C1 and C2 are some constants. According to the equation, the next values of variables are updated using the current values of variables and constants. This type of equations is very common in MD4-based hash algorithms.
104
Y.K. Lee, H. Chan, and I. Verbauwhede
A(n + 1) = A(n) + B(n) ∗ C1 ∗ C2
(1)
B(n + 1) = A(n) The DFG of Eq. 1 is shown in Fig. 1. Box A and B represent registers which give the output at cycle n, and circles represent some functional nodes which perform the given functional operations. A D on edges represents an algorithmic delay, i.e. a delay that cannot be removed from the system. Next to algorithmic delays, nodes also have functional delays. We express the functional delays, i.e. the delays to perform the given operations, of + and ∗ as P rop(+) and P rop(∗) respectively. The binary operators, + and ∗, can be arbitrary operators but we assume P rop(+) < P rop(∗) in this example. The iteration bound analysis starts with an assumption that any functional operation is atomic. This means that a functional operation can not be split or merged into some other functional operations. 3.2
The Iteration Bound Analysis
A loop is defined as a path that begins and ends at the same node. In the DFG in Fig. 1, A−−→ +−−− −→ B−−→ ∗−−→ ∗−−→ +−−− −→ A are the −→ A and A−−− D D D loops. The loop calculation time is defined as the sum of the functional delays in a loop. If tl is the loop calculation time and wl is the number of algorithmic delays in the l-th loop, the l-th loop bound is defined as tl /wl . The iteration bound is defined as follows: 0002 0003 tl T∞ = max (2) l∈L wl where L is the set of all possible loops. The iteration bound creates a link between the arithmetic delay and the functional delay. It is the theoretical limit of a DFG’s delay bound. Therefore, it defines the maximally attainable throughput. Please note that every loop needs to have at least one algorithmic delay in the loop otherwise the system is not causal and cannot be executed.
hOPG GkG
hG
RG
GkG
v
iG
iOP
QG
QG
GjYG
GjXG
Fig. 1. An example of DFG
Iteration Bound Analysis and Throughput Optimum Architecture
105
Since the loop marked with bold line has the maximum loop delay assuming that P rop(+) < P rop(∗), the iteration bound is as follows: 0002 0003 P rop(+) + 2 × P rop(∗) T∞ = max P rop(+), (3) 2 P rop(+) + 2 × P rop(∗) = 2 This means that a critical path delay in this DFG can not be less than this iteration bound. The critical path delay is defined as the maximum calculation delay between any two consecutive algorithmic delays, i.e. D’s. In our example (Fig. 1), the critical path delay is P rop(+) + 2 × P rop(∗) which is larger than the iteration bound. The maximum clock frequency (and thus throughput) is determined by the critical path (the slowest path). The iteration bound is a theoretical lower bound on the critical path delay of an algorithm. We use the retiming and unfolding transformations to reach this lower bound. 3.3
The Retiming Transformation
The minimum critical path delay that can be possibly achieved using the retiming transformation is shown in Eq. 4. 0004 0005 P rop(+) + 2 × P rop(∗) 0004T∞ 0005 = = P rop(+) + P rop(∗) (4) 2 Assuming that a functional node can not be split into multiple parts, 0004·0005 is the maximum part when P rop(+) + 2 × P rop(∗) is evenly distributed into N parts, where N is the number of algorithmic delays in a loop. This is denoted by the 2 in our example and sits in the denominator. Since the total delay P rop(+) + 2 × P rop(∗) can be partitioned into one delay P rop(+) + P rop(∗) and the other delay P rop(∗), the attainable delay bound by the retiming transformation is P rop(+) + P rop(∗). The retiming transformation modifies a DFG by moving algorithmic delays, i.e. D’s, through the functional nodes in the graph. Delays of out-going edges can be replaced with delays from in-coming edges and vice versa. Fig. 2 shows the retiming transformation steps performed on Fig. 1. Based on the + node in Fig. 1, the delay of the out-going edge is replaced with delays of the in-coming edges resulting in Fig. 2(a). Note that the out-going edges and the in-coming edges must be dealt as a set. By performing one more retiming transformation based on the left ∗ node in Fig. 2(a), we obtain the DFG of Fig. 2(b). Therefore, the critical path becomes the path in bold between the two bolded D’s in Fig. 2(b) and its delay is reduced to P rop(+) + P rop(∗) which is the same as Eq. 4. However, the iteration bound still has not been met. 3.4
The Unfolding Transformation
The unfolding transformation improves performance by calculating several iterations in a single cycle. For the unfolding transformation we expand the Eq. 1 by representing two iterations at a time, which results in Eq. 5.
106
Y.K. Lee, H. Chan, and I. Verbauwhede
hOPG GkG
hG
v
iG
iOP
GkG
RG
GkG
QG
QG
GjYG (a) Retimed based on +
GjXG
hOPG GkG
hG
v
iG
iOP
GkG
RG
QG
GkG
GjYG (b) Retimed based on ∗
QG GjXG
Fig. 2. Retiming Transformation
A(n + 2) = A(n + 1) + B(n + 1) ∗ C1 ∗ C2 = A(n) + B(n) ∗ C1 ∗ C2 + A(n) ∗ C1 ∗ C2
(5)
B(n + 2) = A(n + 1) = A(n) + B(n) ∗ C1 ∗ C2 Note that now A(n + 2) and B(n + 2) are expressed as a function of A(n) and B(n). By introducing a temporary variable T mp, Eq. 5 can be simplified into Eq. 6. T mp(n) = A(n) + B(n) ∗ C1 ∗ C2 A(n + 2) = T mp(n) + A(n) ∗ C1 ∗ C2
(6)
B(n + 2) = T mp(n) By doubling the number of functional nodes, we are able to unfold the DFG by a factor of two (Fig. 3(a)). Box A and B now give the outputs of every second iteration. By applying the retiming transformation to the unfolded DFG, the resulting critical path becomes the path in bold between the two bolded D’s which is D → + → A → ∗ → ∗ → D (Fig. 3(b)). Therefore, the critical path delay is P rop(+) + 2 × P rop(∗). Due to the unfolding factor of two, the
Iteration Bound Analysis and Throughput Optimum Architecture
107
hOYPG GkG
GkG
hG
RG RG
v
iG
iOYPG
QG
QG
GjYG
GjXG
QG
QG
GjYG GjXG (a) Unfolding Transformation
hOYPG GkG
hG
RG GkG
RG
GkG
vG
iG
iOYPG
QG
QG
GjYG
GjXG
QG
QG
GjYG GjXG (b) Retiming Transformation
Fig. 3. Unfolding and Retiming Transformation
normalized critical path delay, Tˆ , can be calculated by dividing the critical path delay by two as shown in Eq. 7. P rop(+) + 2 × P rop(∗) Tˆ = = T∞ 2
(7)
This final transformation results in an architecture that achieves the iteration bound of the example DFG (Fig. 1).
108
Y.K. Lee, H. Chan, and I. Verbauwhede
Now the only remaining step is the implementation of the resulting DFG. Note that some of the square nodes are not any more paired with an algorithmic delay, which can be seen in Fig. 3(b). The explanation about how this issue is dealt with during implementation will be given in Section 5 where we synthesize the SHA2 family hash algorithms.
4
Iteration Bound Analysis and Throughput Optimum Architecture of SHA2
The SHA2 family of hash algorithms is composed of three parts: the padding, expander and compressor [2]. The padding extends an input message to be a whole number of 512-bit (for SHA-256) or 1024-bit (for SHA-384 and SHA-512) message blocks. The expander enlarges input messages of 16 words into 64 (for SHA-256) or 80 (for SHA-384 or SHA-512) words. The compressor encodes the expanded message into 256, 384 or 512 bits depending on the algorithm. For one message block, the required iterations are 64 (for SHA-256) or 80 (for SHA-384 or SHA-512). Ch(x, y, z) = (x ∧ y) ⊕ (¬x ∧ z) M aj(x, y, z) = (x ∧ y) ⊕ (x ∧ z) ⊕ (y ∧ z) {256} (x) = ROT R2 (x) ⊕ ROT R13 (x) ⊕ ROT R22 (x) Σ0 {256} (x) = ROT R6 (x) ⊕ ROT R11 (x) ⊕ ROT R25 (x) Σ1 {256} (x) = ROT R7 (x) ⊕ ROT R18 (x) ⊕ SHR3 (x) σ0 {256} (x) = ROT R17 (x) ⊕ ROT R19 (x) ⊕ SHR10 (x) σ1 (a) SHA-256 Functions 0006 Wt =
(i)
{256} (Wt−2 ) σ1
Mt {256} + Wt−7 + σ0 (Wt−15 ) + Wt−16
0 ≤ t ≤ 15 16 ≤ t ≤ 63
(b) SHA-256 Expend Computation {256}
{256}
T1 = h + Σ 1 (e) + Ch(e, f, g) + Kt {256} T2 = Σ 0 (a) + M aj(a, b, c) h=g g=f f =e e = d + T1 d=c c=b b=a a = T1 + T2
(c) SHA-256 Compress Computation Fig. 4. SHA-256 Hash Computation
+ Wt
Iteration Bound Analysis and Throughput Optimum Architecture
109
In this paper, we consider only the implementation of the expander and compressor. Though the expander can be performed before the compressor, we chose to implement it to perform dynamically during compression in order to increase the overall throughput and minimize the gate area. Fig. 4 describes the SHA-256 algorithm on which a DFG will be derived based. Since all the SHA2 family hash algorithms have the same architecture except for input, output and word sizes, constants, non-linear scrambling functions, i.e. Σ0 , Σ1 , M aj, Ch, σ0 and σ1 , and the number of the iterations, they can be expressed in the same DFG. 4.1
DFG of SHA2 Compressor
Since within one iteration the order of additions in SHA2 does not affect the results, there are several possible DFG’s. For example, (a + b) + c and (b + c) + a are equivalent in mathematics but will have different DFG’s. As a starting point, the DFG having the minimum iteration bound must be chosen, transformations are then performed to find the architecture that achieves this bound. In SHA2 compressor, since there are only 7 adders, finding a DFG having the minimum iteration bound is not difficult as long as we understand how to calculate the iteration bound. The DFG in Fig. 5 is a straightforward DFG. The shaded loop indicates the loop having the largest loop bound and gives the following iteration bound. 0002 0003 tl ( 5) = max = 3 × P rop(+) + P rop(Ch) (8) T∞ l∈L wl However, by reordering the sequence of additions, the DFG of Fig. 6 can be obtained which has the smallest iteration bound. As we assume that P rop(Σ0) ≈ P rop(M aj) ≈ P rop(Σ1) ≈ P rop(Ch), the two bolded loops have the same maximum loop bound. Since the loop bound of the left hand side loop cannot be reduced further, no further reduction in the iteration bound is possible. Therefore, the iteration bound of Fig. 6 is as follows. 0002 0003 tl ( 6) = 2 × P rop(+) + P rop(Ch) (9) T∞ = max l∈L wl If we assume that any operation in the DFG cannot be merged or split into other operations, the iteration bound of SHA2 is Eq. 9. However, if we are allowed to use a Carrier Save Adder (CSA), we can substitute two consecutive adders with one CSA and one adder. Since CSA requires less propagation delay than an adder, we replace adders with CSA’s if it is possible. The resulting DFG is shown in Fig. 7. Note that some of the adders are not replaced with CSA since doing so would increase the iteration bound. Therefore, the final iteration bound is achieved as Eq. 10. 0002 0003 tl ( 7) = P rop(+) + P rop(CSA) + P rop(Ch) (10) T∞ = max l∈L wl
110
GkG
Y.K. Lee, H. Chan, and I. Verbauwhede
hG
GkG
Σ0
iG
GkG
jG
GkG
kG
RG
GkG
tG
.
RG
lG
GkG
mG
GkG
nG
GkG
oG
Σ1
jG
RG
rG
RG
RG
RG
~
RG Fig. 5. Basic SHA2 Compressor DFG
GkG
hG
GkG
iG
GkG
jG
GkG
kG
GkG
lG
GkG
mG
GkG
nG
GkG
oG
Σ1
.
jG
.
RG
rG
RG
RG
RG
RG
~
RG
RG
RG
RG
Σ0
tG
Fig. 6. Optimized SHA2 Compressor DFG
GkG
hG
Σ0
GkG
iG
tG
GkG
jG
GkG
kG
GkG
lG
Σ1
GkG
mG
GkG
nG
GkG
. .
jG
oG
jzh
jzhG
RG
jzhG
RG
RG
RG
jzhG
RG
Fig. 7. Optimized SHA2 Compressor DFG with CSA
rG ~G
Iteration Bound Analysis and Throughput Optimum Architecture
GkG
hG
GkG
iG
jG
kG
GkG
lG
GkG
mG
jzhG
oG
GkG
GkG
Σ0
nG
111
Σ1
tG
GkG
RG
jzh
. .
jG
jzhG
GYkG
GkG rRYG
~RZG
RG GkG
RG
RG
jzhG
GkG
RG
Fig. 8. Final SHA2 Compressor DFG with Retiming Transformation
In the next step, we perform transformations. Since there is no fraction in the iteration bound, we do not need the unfolding transformation. Only the retiming transformation is necessary to achieve the iteration bound. The retimed DFG achieving the iteration bound is depicted in Fig. 8. Note that the indexes of Kt+2 and Wt+3 are changed due to the retiming transformation. In order to remove the ROM access time for Kt+2 , which is a constant value from ROM, we place an algorithmic delay, i.e. D, in front of Kt+2 . This does not change the function. 4.2
DFG of SHA2 Expander
A straightforward DFG of the SHA2 expander is given in Fig. 9(a). Even though the iteration bound of the expander is much less than the compressor, we do not need to minimize the expander’s critical path delay less than the compressor’s iteration bound (the throughput is bounded by the compressor’s iteration bound). Fig. 9(b) shows a DFG with CSA, and Fig. 9(c) shows a DFG with the retiming transformation where the critical path delay is P rop(+).
5
Implementation and Synthesis Results
In the DFG of Fig. 8, some of the register values, i.e. A, B, .., H, are no longer paired with an algorithmic delay D. For example, there is no algorithmic delay between registers F and H. Therefore, the values of H will be the same as F except for the first two cycles: in the first cycle, the value of H should be the initialized value of H according to the SHA2 algorithm; in the second cycle the value of H should be the initialized value of G. Therefore, the value of F will be directly used as an input of the following CSA. Another register management problem is found in the path from the register H to the register A which includes four algorithmic delays. Therefore, register A has to hold its initial value until an effective value of H reaches to the register A, which means the register A must hold the first three cycles and then it can update
112
ɍ0011
Gk
~
Gk
~RX
Gk
~RY
Gk
~RZ
Gk
~R[
Gk
~R
Gk
~R]
Gk
~R^
Gk
~R_
Gk
~R`
Gk
~RXW
Gk
~RXX
Gk
~RXY
Gk
~RXZ
Gk
~RX[
~RX
Gk
Y.K. Lee, H. Chan, and I. Verbauwhede
ɍ0010
R
R
R
(a) Basic DFG
ɍ0011
R
Gk
~
Gk
~RX
Gk
~RY
Gk
~RZ
Gk
~R[
Gk
~R
Gk
~R]
Gk
~R^
Gk
~R_
Gk
~R`
Gk
~RXW
Gk
~RXX
Gk
~RXY
Gk
~RXZ
Gk
~RX[
~RX
Gk
ɍ0010
jzh
jzh
(b) DFG with CSA
jzh
Gk
~
Gk
Gk
~RX
ɍ0011
R
Gk
~RY
Gk
~RZ
Gk
~R[
Gk
~R
Gk
~R]
Gk
~R^
Gk
~R_
Gk
~R`
Gk
~RXW
Gk
~RXX
Gk
~RXY
Gk
~RXZ
~RX[
~RX
Gk
ɍ0010 jzh
(c) Retiming Transformed DFG with CSA
Fig. 9. SHA2 Expander DFG
with a new value. This causes overhead of three extra cycles. Therefore the total number of cycles required for one message block is the number of iterations plus one cycle for initialization and finalization plus three overhead cycles due to the retiming transformation, which results in 68 cycles for SHA256 and 84 cycles for SHA384 and SHA512. We synthesized SHA2 for an ASIC by Synopsys Design Vision using a 0.13μm standard cell library whose results and a comparison with other works are shown in Table 1. The throughputs are calculated using the following equation. F requency # of Cycles × (512 bits) T hroughput384,512 = #F requency of Cycles × (1024 bits)
T hroughput256 =
(11)
Note that our purpose of the synthesis is not to beat the throughput record but to verify our architecture by checking the correct hash outputs and the actual critical path. Since our HDL programming is done at register transfer level and we have mostly concentrated on optimizing micro-architecture rather than focusing lower-level optimization, some other reported results, e.g. [4], achieve
Iteration Bound Analysis and Throughput Optimum Architecture
113
Table 1. Synthesis Results and Comparison of SHA2 Family Hashes Technology Area Frequency Throughput Cycles (ASIC) (Gates) (MHz) (Mbps) [14] SHA256 0.18μm 22,000 200 65 1,575 SHA256 15,329 333.3 72 2,370 [13] 0.13μm SHA384/512 27,297 250.0 88 2,909 [4] SHA256 0.13μm N/A >1,000 69 >7,420 Our SHA256 22,025 793.6 68 5,975 0.13μm Results SHA384/512 43,330 746.2 84 9,096 *All our proposals include all the registers and ROM. Algorithm
better performance with the same iteration bound delay. However the iteration bound analysis still determines the optimum high level architecture of an algorithm.
6
Conclusion
We analyzed the iteration bound of the SHA-256 (384,512) hash algorithms and showed architectures achieving the iteration bound. Since the iteration bound is a theoretical limit, there will be no further throughput optimization in microarchitecture level. We also synthesized our design to verify the correctness of our architecture design. Moreover, we illustrated detailed steps from a straightforward DFG to a throughput optimized architecture. This approach will guide how to optimize some other iterative hash algorithms in throughput. Acknowledgments. This work is supported by NSF CCF-0541472, FWO and funds from Katholieke Universiteit Leuven.
References 1. Digital Signature Standard. National Institute of Standards and Technology. Federal Information Processing Standards Publication 186-2, http://csrc.nist.gov/publications/fips/fips186-2/fips186-2-change1.pdf 2. Secure Hash Standard. National Institute of Standards and Technology. Federal Information Processing Standards Publication 180-2, http://csrc.nist.gov/publications/fips/fips180-2/fips180-2.pdf 3. Parhi, K.K.: VLSI Digital Signal Processing Systems: Design and Implementation, pp. 43–61, 119–140. Wiley, Chichester (1999) 4. Dadda, L., Macchetti, M., Owen, J.: An ASIC design for a high speed implementation of the hash function SHA-256 (384, 512). In: ACM Great Lakes Symposium on VLSI, pp. 421–425 (2004) 5. Dadda, L., Macchetti, M., Owen, J.: The design of a high speed ASIC unit for the hash function SHA-256 (384, 512). In: DATE 2004. Proceedings of the conference on Design, Automation and Test in Europe, pp. 70–75. IEEE Computer Society Press, Los Alamitos (2004)
114
Y.K. Lee, H. Chan, and I. Verbauwhede
6. Macchetti, M., Dadda, L.: Quasi-pipelined hash circuits. In: ARITH 2005. Proceedings of the 17th IEEE Symposium on Computer Arithmetic, pp. 222–229 (2005) 7. McEvoy, R.P., Crowe, F.M., Murphy, C.C., Marnane, W.P.: Optimisation of the SHA-2 Family of Hah Functions on FPGAs. In: ISVLSI 2006. Proceedings of the 2006 Emerging VLAI Technologies and Architectures, pp. 317–322 (2006) 8. Michail, H., Kakarountas, A.P., Koufopavlou, O., Goutis, C.E.: A Low-Power and High-Throughput Implementation of the SHA-1 Hash Function. In: ISCAS 2005. IEEE International Symposium on Circuits and Systems, pp. 4086–4089 (2005) 9. Crowe, F., Daly, A., Marnane, W.: Single-chip FPGA implementation of a cryptographic co-processor. In: FPT 2004. Proceedings of the International Conference on Field Programmable Technology, pp. 279–285 (2004) 10. Lien, R., Grembowski, T., Gaj, K.: A 1 Gbit/s partially unrolled architecture of hash functions SHA-1 and SHA-512. In: Okamoto, T. (ed.) CT-RSA 2004. LNCS, vol. 2964, pp. 324–338. Springer, Heidelberg (2004) 11. Ming-yan, Y., Tong, Z., Jin-xiang, W., Yi-zheng, Y.: An Efficient ASIC Implementation of SHA-1 Engine for TPM. In: The 2004 IEEE Asia-Pacific Conference on Circuits and Systems, pp. 873–876 (2004) 12. Ganesh, T.S., Sudarshan, T.S.B.: ASIC Implementation of a Unified Hardware Architecture for Non-Key Based Cryptographic Hash Primitives. In: ITCC 2005. Proceedings of the International Conference on Information Technology: Coding and Computing, pp. 580–585 (2005) 13. Satoh, A., Inoue, T.: ASIC-Hardware-Focused Comparison for Hash Functions MD5, RIPEMD-160, and SHS. In: ITCC 2005. Proceedings of the International Conference on Information Technology: Coding and Computing, pp. 532–537 (2005) 14. Helion IP Core Products. Helion Technology http://heliontech.com/core.htm
A Compact Architecture for Montgomery Elliptic Curve Scalar Multiplication Processor Yong Ki Lee1 and Ingrid Verbauwhede1,2 1
University of California, Los Angeles, USA Katholieke Universiteit Leuven, Belgium {jfirst,ingrid}@ee.ucla.edu
2
Abstract. We propose a compact architecture of a Montgomery elliptic curve scalar multiplier in a projective coordinate system over GF (2m ). To minimize the gate area of the architecture, we use the common Z projective coordinate system where a common Z value is kept for two elliptic curve points during the calculations, which results in one register reduction. In addition, by reusing the registers we are able to reduce two more registers. Therefore, we reduce the number of registers required for elliptic curve processor from 9 to 6 (a 33%). Moreover, a unidirectional circular shift register file reduces the complexity of the register file, resulting in a further 17% reduction of total gate area in our design. As a result, the total gate area is 13.2k gates with 314k cycles which is the smallest compared to the previous works. Keywords: Compact Elliptic Curve Processor, Montgomery Scalar Multiplication.
1
Introduction
Even though the technology of ASIC advances and its implementation cost decreases steadily, compact implementations of security engines are still a challenging issue. RFID (Radio Frequency IDentification) systems, smart card systems and sensor networks are good examples which need very compact security implementations. Public key cryptography algorithms seem especially taxing for such applications. However, for some security properties such as randomized authentications and digital signatures, the use of public key cryptography algorithms is often inevitable. Among public key cryptography algorithms, elliptic curve cryptography is a good candidate due to its efficient computation and relatively small key size. In this paper, we propose an architecture for compact elliptic curve multiplication processors using the Montgomery algorithm [1]. The Montgomery algorithm is one of the most popular algorithms in elliptic curve scalar multiplication due to its resistance to side-channel attack. We use the projective coordinate system to avoid inverse operations. In order to minimize the system size, we propose new formulae for the common projective coordinate system where all the Z-coordinate values are equal. S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 115–127, 2007. c Springer-Verlag Berlin Heidelberg 2007 0002
116
Y.K. Lee and I. Verbauwhede
When we use L´opez-Dahab’s Montgomery scalar multiplication algorithm [2], two elliptic curve points must be kept where X and Z-coordinate values for each point. Therefore, by the use of the common Z projective coordinate property, one register for a Z-coordinate can be reduced. Considering that the register size is quite large, e.g. 163, reducing even one register is a very effective way to minimize the gate area. Moreover, efficient register management by reuse of the registers makes it possible to reduce two additional registers. Therefore, we reduce three registers out of nine in total compared to a conventional architecture. In addition, we design a unidirectional circular shift register file to reduce the complexity of the register file. While the multiplexer complexity of a register file increases as the square of the number of the registers, that of our register file is a small constant. Therefore, the proposed register file architecture effectively reduces the overall area. Though the register file is small (6 registers) an additional 17% of gate area is reduced using this technique. We also show the synthesis results for various digit sizes where the smallest area is 13.2k gates with the cycles of 314k. The remainder of this paper is organized as follows. In Section 2, we review the background on which our work is based. In Section 3, the common Z projective coordinate system is introduced and its corresponding formulae are given. The proposed system architecture and the synthesis results are shown in Section 4 and Section 5 followed by the conclusion in Section 6.
2
Background
2.1
L´ opez-Dahab’s Montgomery Scalar Multiplication
In this section we introduce L´ opez-Dahab’s Montgomery scalar multiplication algorithm, which uses a projective coordinate system [2]. The algorithm is shown in Fig. 1. A non-supersingular elliptic curve E over GF (2m ) is the set of coordinative points (x, y) satisfying y 2 + xy = x3 + ax2 + b with the point at infinity O, where a, b, x, y ∈ GF (2m ) and b 0003= 0.
l−2 Input: A point P = (x, y) ∈ E and a positive integer k = 2l−1 + Σi=0 ki 2i Output: Q = kP
1. 2. 3.
4.
if (k = 0 or x = 0) then output (0, 0) and stop X1 ← x, Z1 ← 1, X2 ← x4 + b, Z2 ← x2 for i = l − 2 to 0 do if ki = 1 then (X1 , Z1 ) ← Madd(X1 , Z1 , X2 , Z2 ), (X2 , Z2 ) ← Mdouble(X2 , Z2 ) else (X2 , Z2 ) ← Madd(X2 , Z2 , X1 , Z1 ), (X1 , Z1 ) ← Mdouble(X1 , Z1 ) return Q ← Mxy(X1 , Z1 , X2 , Z2 )
Fig. 1. Montgomery scalar multiplication with L´ opez-Dahab algorithm
A Compact Architecture for Montgomery Elliptic Curve
117
The adding formula of (XAdd , ZAdd ) ← Madd(X1 , Z1 , X2 , Z2 ) is defined in Eq. 1. ZAdd = (X1 × Z2 + X2 × Z1 )2
(1)
XAdd = x × ZAdd + (X1 × Z2 ) × (X2 × Z1 ) The doubling formula of (XDouble , ZDouble ) ← Mdouble(X2 , Z2 ) is defined in Eq. 2. ZDouble = (X2 × Z2 )2 XDouble =
X24
+b×
(2)
Z24
Q ← Mxy(X1 , Z1 , X2 , Z2 ) is the conversion of projective coordinate to affine coordinate. L´ opez-Dahab’s adding and doubling algorithms are described in Fig. 2 where c2 = b. The total number of registers in Fig. 2 is six, i.e. the registers for X1 , Z1 , X2 , Z2 , T1 and T2 . The total field operations of Adding Algorithm are 4 multiplications, 1 square and 2 additions, and those of Doubling Algorithm are 2 multiplications, 4 squares and 1 addition. Note that it is not necessary to maintain Y -coordinate during the iterations since it can be derived at the end of the iterations. Adding Algorithm (X1 , Z1 ) ← Madd(X1 , Z1 , X2 , Z2 ) 1. T1 ← x 2. X1 ← X1 × Z2 3. Z1 ← Z1 × X2 4. T2 ← X1 × Z1 5. Z1 ← Z1 + X1 6. Z1 ← Z12 7. X1 ← Z1 × T1 8. X1 ← X1 + T2
Doubling Algorithm (X, Z) ← Mdouble(X, Z) 1. T1 ← c 2. X ← X 2 3. Z ← Z 2 4. T1 ← Z × T1 5. Z ← Z × X 6. T1 ← T12 7. X ← X 2 8. X ← X + T1
Fig. 2. L´ opez-Dahab’s Adding and Doubling Algorithms
2.2
Modular Arithmetic Logic Unit (MALU) and Elliptic Curve Processor Architecture
In order to perform the field operations, i.e. the multiplications, squares and additions in Fig. 2, we need an Arithmetic Logic Unit (ALU). Fig. 3 shows the MALU architecture of K. Sakiyama et al [5]. This is a compact architecture which performs the arithmetic field operations as shown in Eq. 3. C(x) = A(x) ∗ B(x) mod P (x)
if cmd = 1
C(x) = B(x) + C(x) mod P (x)
if cmd = 0
(3)
where A(x) = Σai xi , B(x) = Σbi xi , C(x) = Σci xi and P (x) = x163 + x7 + x6 + x3 + 1.
118
Y.K. Lee and I. Verbauwhede yhtG
ths G X]ZG
hG
X]ZG iG
X]ZG jG
ccGG X]ZG
G
GYG
hG
X]YG
WG
UUUG
WG
U
UUU
UUU
UU UUU
X]ZG
{G
UUUG
UU UUU UUU UUU UUU UUU UUU UUU UUU UUU UUU UUU UUU
{G
UU UUU
UUU
GTXG X]ZG
UUU
UUU
UUUG
hX]ZTRXG
GXG
iG
¢G
hX]XG
UU UUU
jG
UU
UUU
GWG
¢G
hX]ZG hX]YG
¢G
UUUG
X]YG
G
{G
G
(a) MALU Architecture
(b) Cell Architecture
Fig. 3. MALU Architecture
yvtG jG GG G
hs
yhtG Fig. 4. MALU based Elliptic Curve Processor Architecture
d is the digit size and the number of cells. The square operation uses the same logic as the multiplication0002by duplicating the operand. The arithmetic multipli0003 and one cycle respectively. The benefit of this cation and addition take 163 d architecture is that the multiplication, the square and the addition operations share the XOR array and by increasing the digit size, the MALU can be easily scaled. The architecture of our ALU starts from this MALU. ECP (Elliptic Curve Processor) architecture based on MALU is shown in Fig. 4 [6]. Note that in Fig. 4, ALU is implemented with MALU and hence includes three registers, and RAM contains five words of 163 bit size.
A Compact Architecture for Montgomery Elliptic Curve
2.3
119
Implementation Consideration
If L´ opez-Dahab’s Montgomery scalar multiplication algorithm is implemented using Sakiyama’s MALU in a conventional way, the total number of registers is 9, i.e. 3 registers for MALU plus 6 registers for the Montgomery scalar multiplication. In [6], 3 registers and 5 RAMs are used (8 memory elements in total). One register is reduced by modifying L´ opez-Dahab’s algorithm and assuming that constants are loadable directly to the MALU without using a register. In our architecture, we are able to reduce the number of registers to 6 even without constraining ourselves to these assumptions. This was accomplished by observing the fact that the area of a scalar multiplier is dominated by register area. Note that the registers occupy more than 80% of the gate area in a conventional architecture. Therefore, reducing the number of the registers and the complexity of the register file is a very effective way to minimize the total gate area. Accordingly, our compact architecture is achieved in two folds: reducing the number of registers (one register reduction by using the common Z projective coordinate system and two register reduction by register reuse) and reducing the register file complexity by designing a unidirectional circular shift register file.
3
Common Z Projective Coordinate System
We propose new formulae for the common Z projective coordinate system where the Z values of two elliptic curve points in Montgomery scalar multiplication are kept to be the same during the process. New formulae for the common Z projective coordinate system have been proposed over prime fields in [3]. However, this work is still different from ours in that first, they made new formulae over prime field while ours is over binary polynomial field and second, they made new formulae to reduce the computation amount in special addition chain while our formulae slightly increase the computation amount in order to reduce the number of the registers. Please note that reducing even one register decreases the total gate area considerably. Since in L´ opez-Dahab’s algorithm, two elliptic curve points must be maintained, the required number of registers for this is four (X1 , Z1 , X2 and Z2 ). Including two temporary registers (T1 and T2 ), the total number of registers is six. The idea of the common Z projective coordinate system is to make sure that opez-Dahab’s algorithm. The condition at the beZ1 = Z2 at each iteration of L´ ginning of the iterations is satisfied since the algorithm starts the iterations with the initialization of Z1 = Z2 = 1. Even if Z1 0003= Z2 , we can make it satisfy this condition using three field multiplications as shown in Eq. 4 where the resulting coordinate set is (X1 , X2 , Z). X1 ← X1 × Z2 X2 ← X2 × Z1 Z ← Z1 × Z2
(4)
120
Y.K. Lee and I. Verbauwhede Table 1. The comparison between the original and the modified formulas
XAdd
The original equation The new equation assuming Z = Z1 = Z2 ZAdd = (X1 × Z2 + X2 × Z1 )2 ZAdd = (X1 + X2 )2 = x × ZAdd + (X1 × Z2 ) × (X2 × Z1 ) XAdd = x × ZAdd + X1 × X2
Since we now assume Z1 = Z2 , we can start the Adding Algorithm with the common Z projective coordinate system. With Z = Z1 = Z2 , Eq. 1 is re-represented as shown in Eq. 5. Now ZAdd and XAdd have a common factor of Z 2 . ZAdd = (X1 × Z2 + X2 × Z1 )2 = (X1 + X2 )2 × Z 2 XAdd = x × ZAdd + (X1 × Z2 ) × (X2 × Z1 )
(5)
= x × ZAdd + (X1 × X2 × Z ) 2
Due to the property of the projective coordinate system, we can divide ZAdd and XAdd by the common factor of Z 2 . The comparison of the original equation and the modified equation is shown in Table 1. Note that the new formula of the Adding Algorithm is independent of the previous Z-coordinate value. In Doubling Algorithm, there is no such reduction since it deals with only one elliptic curve point. Nevertheless, we can simplify the Doubling Algorithm by noticing that T12 + X 2 ≡ (T1 + X)2 at the steps of 6, 7 and 8 in Fig. 2. One field multiplication can be reduced using this mathematical equality. The Eq. 2 is re-represented in Eq. 6 where c2 = b. ZDouble = (X2 × Z)2 XDouble =
(6)
+c×Z )
(X22
2 2
Note that the resulting Z-coordinate values are different between Adding and Doubling formulae. In order to maintain a common Z-coordinate value, some extra steps similar to Eq. 4 are required. These extra steps must follow every pair of Adding and Doubling Algorithm. The final mathematical expression and its algorithm are shown in Eq. 7 and Fig. 5 respectively. 0005 0004 X1 ← XAdd ZDouble = x(X1 + X2 )2 + X1 X2 (X2 Z)2 ; P 1 ← P 1 + P 2 X2 ← XDouble ZAdd = (X22 + cZ 2 )2 (X1 + X2 )2 Z ← ZAdd ZDouble = (X1 + X2 ) (X2 Z) 2
2
;P2 ← 2 × P2
(7)
; The new common Z-coordinate
In Fig. 5 we mark with (T1 ) at each square operation to indicate that the T1 register is free to store some other value. The reason for this will be obvious in the next section. The comparison of the amount of field operations between L´opez-Dahab’s algorithm and our algorithm is shown in Table 2. Noting that the multiplication and the square are equivalent in the MALU operation, the workload of our algorithm is the same as that of L´ opez-Dahab’s algorithm and we still reduce one register.
A Compact Architecture for Montgomery Elliptic Curve
Adding Algorithm 1. T2 ← X1 + X2 2. T2 ← T22 (T1 ) 3. T1 ← X1 × X2 4. X1 ← x 5. X1 ← T2 × X1 6. X1 ← X1 + T1
Doubling Algorithm 1. X2 ← X22 (T1 ) 2. Z ← Z 2 (T1 ) 3. T1 ← c 4. T1 ← Z × T1 5. Z ← Z × X2 6. X2 ← X2 + T1 7. X2 ← X22 (T1 )
1. 2. 3.
121
Extra Steps X 1 ← X1 × Z X2 ← X2 × T2 Z ← Z × T2
Fig. 5. Proposing Adding and Doubling Algorithms Table 2. Comparison of the computational workload Field Operation Multiplication Square Addition
4 4.1
L´ opez-Dahab’s algorithm 6 5 3
Our algorithm 7 4 3
Proposing System Architecture Arithmetic Logic Unit (ALU) Architecture
The ALU architecture in Fig. 6 is similar to MALU in Fig. 3. The only difference is in the placement of the registers and the control outside the ALU block. Therefore, the ALU block is equivalent to an array of cells in Fig. 3. The reason we separate the registers from the ALU block is for the reuse of the registers. Note that at the completion of the multiplication or addition operation, only the register Reg1 is updated while the registers Reg2 and Reg3 are remained as the beginning of the operations. Therefore, Reg2 and Reg3 can be used not only to store field operands but also to store some values of the proposed Adding and Doubling algorithm where we need five registers for X1 , X2 , Z, T1 , and T2 in Fig. 5.
X]ZG
jGXG
hs G X]ZG
jXG
X]ZG
G tzkG
yGXG
yGYG
yGZ
ccGG jGYG
Fig. 6. ALU Architecture
122
Y.K. Lee and I. Verbauwhede
A care should be taken at this point since the same value must be placed in the both of Reg2 and Reg3 for squaring. Therefore, during squaring, only one register can be reused. This fact would conflict with our purpose to reuse each of Reg2 and Reg3 as a storage of the adding and doubling algorithm. Fortunately, it is possible to free one of the registers to hold another value during squaring. As shown in Fig. 5, T1 can be reused whenever a square operation is required. In Fig. 6, the control line Ctl1 signals the command (multiplication or addition) and the last iteration of the multiplication. When ALU performs a multiplication, each digit of d bits of Reg2 must be entered into ALU in order. Instead of addressing each digit of the 163 bit word, the most significant digit (MSD) is entered and a circular shift of d bits is performed. The shift operation must be circular and the last shift must be the remainder of 163/d since the value must be kept as the initial value at the end of the operation. During performing the ALU operation, an intermediate result is stored in Reg1. Reg1, Reg2 and Reg3 are comparable with C, A and B in Fig. 3 respectively. 4.2
Circular Shift Register File Architecture
By reusing the registers, we reduce two of the registers in the previous subSection. This causes that all the registers should be organized in single register file. Therefore, the register file of our system consists of six registers. In our register file architecture, we use a circular shift register file with a minimum number of operations. The multiplexer complexity of a randomly accessible register file increases as the square of the number of registers. On the other hand, since the multiplexer complexity of a circular shift register file is a constant, this model effectively reduces the total gate area. The operations defined in Fig. 7 are the minimum operations such that any replacement or reordering of the register values can be achieved. Since only Reg1 gets multiple inputs, only one multiplexer of fixed size is necessary. Note that Reg1, Reg2 and Reg3 in Fig. 7 are the three registers which are connected to the ALU in Fig. 6. The assignment operation loads the constants of elliptic curve parameters into Reg1 which is the only register to be assigned a constant value. The shift operation shifts the register values in circular and the switch operation switches the values of Reg1 and Reg2. The copy operation replaces the value of Reg1 with Reg2. Note that the copy operation is required for the field square operation which is implemented as the field multiplication with two operands of the same value. 4.3
Overall System Architecture
The overall system architecture is shown in Fig. 8. Elliptic curve point add and doubler (EC Add&Doubler) consists of Control 1, ALU and the register file. Control 1 includes the hard-wired elliptic curve parameters and manages the register file. Control 2 detects the first bit of 1 in Key (or a scalar) and controls EC Add&Doubler depending on the Key values of the later bits according to the Montgomery algorithm in Fig. 1. Key and Tester are placed outside Montgomery
A Compact Architecture for Montgomery Elliptic Curve
yGXG
yGYG
yGZG
yG[G
yGG
yG]
yGG
yG]
yGG
yG]G
yGG
yG]
123
yGXG
yGYG
yGZG
yG[G
<Shift Operation>
yGXG
yGYG
yGZG
yG[G
<Switch Operation>
yGXG
yGYG
yGZG
yG[G
? [2 2]
FI31 [9 12]
b001b h (( (h (h h
FI32
[6 8]
b001b h (( (h (h h
FI33
b001b h (( (h (h h [9 12] ? (7) 0002L7 Δ
Z3
=0x6d
[2 2]
?
S9 [4] b001b
h (( (h (h h
- S7 [6] b001b h (( (h (h h S9 001b [8] bh
(( (h (h h [6] [8] ?
[3 3]
?
S9 [6] b001b
[3 3]
001b
h (( (h (h h
S7 [9] b001b
h (( (h (h h
S9 001b [12]bh
(( (h (h h [9] [12] ? 6
? [2 2]
FI31 [9 12]
b001b hh (( (h ( h
FI32
[9 12]
b001b hh (( (h ( h
FI33 [8 11]
b001b hh (( (h ( h ? 0002L7
Δ(7) Z3
=unknown
Fig. 4. Formal analysis of FO3
eighth order differential. Moreover, we consider the case of NOT fixed KL21 . The plaintext sub-blocks X2 and X3 affect the maximum degree of output because they are inputted to FI21 . Therefore we cannot use them as variable sub-blocks. Thus we consider an eighth order differential using X0 and one bit selected from among X1 . We confirmed that Δ(8) Z30003 L7 =0 for such an eighth order differential by computer simulation although we have to omit the details of the simulation due to the limitted space. The attack equation derived from the equation Δ(8) Z30003 L7 = 0 has unknowns with respect to sub-keys for FL6 and FO5 . However, if we fix the value of the secret key sub-block as K7 = KL62 =0xffff, we have Δ(8) Z3L7 = 0. We can neglect the unknowns with respect to the sub-keys for FL6 .
220
H. Tanaka et al. B
Δ(8) Z3L7 =0 C -FO5 - b A h (( (h (h h -FO6 - b h h( h( (( h FL7
FL8
CL
CR
?
?
Fig. 5. Relation of the variables A, B and C in equation (3.4)
Our attack succeeds under the condition that the secret key sub-blocks satisfy K5 = K7 =0xffff. inter.tex 3.2
Attack Equation
In general, an attack equation is derived by calculating a higher order differential from the ciphertext side step by step. So, the attacker starts to derive an equation to solve the sub-keys in the last round. Then he derives a new equation to determine the next level unknown sub-keys using the information already obtained. So it is necessary for attacker to solve many attack equations. Contrary to such a general approach, we employ a different method, that is, we derive an attack equation for a secret key directly. So it suffices for us to construct an attack equation only once, and solve it. We apply a method in [7] to solve the attack equation. In this paper, we call it an algebraic method. The algebraic method was proposed by Moriai, Shimoyama and Kaneko, to solve KN cipher and CAST. This method solves higher order equations by regarding them as linear equations: basically, every product of (more than two) variables is identified to a new variables, and the resulting equation has order 1, that is, every term of the resulting equation contains only one variables. Let us demonstrate an example of higher order equation with three variables (x2 , x1 , x0 ) and the equation transformed. The equation a0 x0 ⊕ a1 x1 ⊕ a2 x0 x2 = 1 is transformed to a0 x0 ⊕ a1 x1 ⊕ a2 x3 = 1, where x3 is a new variable which is identified to x0 x2 . We should note that the resulting equation is linear, that is, every term contains at most one variable. So we can regard the equation as a linear equation which contains at most 3 C1 + 3 C2 + 3 C3 = 7 new variables if the original equation contains three variables. Since the higher order equation with three variables is defined as the linear equation with seven variables by this procedure, to determine variables, we need more linear independent equations than equations required by a brute force search. However, the computational cost to solve linear equations is much smaller than the cost required by a brute force search. The reader is referred to [7] for the precise definition and detailed discussion of the algebraic method. The number of the independent unknown terms turns out to be large if the number of secret key sub-blocks inputted to S-boxes is large. On the other hand,
Security Analysis of MISTY1
221
it is hard to solve a system of equations with a large number of independent unknown terms. Therefore, we need to keep the number of secret key sub-blocks inputted to S-boxes to a minimum in order to make the attack equation easily solvable. Following such a strategy, we construct an attack equation with a small number of independent unknown terms, and then present an attack algorithm against a six round MISTY1 using 2-round elimination method that estimates a part of the secret key sub-blocks with a brute force search. Let us see the construction of our attack equation as follows. First of all, the following attack equation can be derived using 2-round elimination method. Δ(8) Z3L7 = Δ(8) {A + B} = 0, ⎧ ⎪ A = FL8 (CR ; KL8 )L7 ⎪ ⎪ ⎪ ⎪ = FL8 (CR ; K60003 , K8 )L7 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ B = FO5 (C; KO5 , KI5 )L7 ⎪ ⎪ = FO5 (C; K1 , K20003 , K4 , K5 , K60003 , K7 , K80003 )L7 ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ C = FL7 (CL ; KL7 ) + FO6 (FL8 (CR ; K60003 , K8 ); KO6 , KI6 ) ⎪ ⎩ = FL7 (CL ; K20003 , K4 ) + FO6 (FL8 (CR ; K60003 , K8 ); K10003 , K2 , K30003 , K5 , K70003 , K8 ) (3.4) The relation of the variables A, B and C is shown in Figure 5. Next, we consider the optimal combination which divides unknowns into the group determined by an algebraic method and the group determined by a brute force search. Since K5 and K7 are fixed, we have 6 secret key sub-blocks (total: 96 bits) as unknowns. Since the number of terms inputted to the S-boxes becomes the maximum, it is difficult to solve KO61 , KO62 and KI61 by using the algebraic method. Thus we use a brute force search to estimate the secret key sub-blocks, K6 (= KO61 ), K8 (= KO62 ) and K30003 (= KI61 ). Since K70003 (= FI(K7 ; K8 )), we can treat the terms with respect to KI62 = K70003 as known values. Consequently, K5 , K7 = fixed, K30003 , K6 , K8 = estimated by brute force search, K70003 = calculated,
(3.5)
and we have K1 , K10003 , K2 , K20003 and K4 as unknowns. Note that K1 = KO54 is a constant term in the attack equation. Thus the terms with respect to K1 do not exist in the eighth order differential. Thus we can omit terms with respect to K1 and determine K10003 , K2 , K20003 and K4 (total 64 bit) by using the algebraic method. Consequently, our attack equation consists of equation (3.4) and (3.5). By solving the attack equation, we can determine secret key sub-blocks K10003 , K2 , K20003 , K30003 , K4 , K5 , K6 , K7 and K8 . Since K10003 = FI(K1 ; K2 ) and K30003 = FI(K3 ; K4 ), secret key sub-block K1 and K3 can be determined by using (K10003 , K2 ), or (K30003 , K4 ).
222
3.3
H. Tanaka et al.
Complexity
Let us assume that an attack equation derived from the value of the n-th order differential of the b-bit sub-block among the output. Let L be the number of independent unknowns in the attack equation. To determine all the unknowns, we need at least L linear equations. Since we can derive b linear equations from one n-th order differentials, we need 0004L/b0005 different n-th order differential to derive the L × L coefficient matrix. If we use the same techniques shown in [7] to derive the matrix, we need to calculate the F function operation using L different temporary sub-keys. Thus we need 2n × 0004L/b0005 chosen plaintexts and 2n × 0004L/b0005 × L F function operations (computational cost). After deriving the coefficient matrix, we can solve the equation by Gauss - Jordan elimination. In general, the cost required by Gauss - Jordan elimination can be negligible comparing with the cost required to calculate theL × L coefficient matrix. We consider the complexity to solve the attack equation by estimating the other unknown S (s bits). If we solve the attack equation using L + α linear equations, the equation holds a false value of S, with probability 2−α . Therefore, if we have additional α linear equations for such that 2s−α << 1, we are able to eliminate all false value of S. Thus we need 2n × 0004(L + α)/a0005 chosen plaintexts and 2n+s × 0004(L + α)/a0005 × L computational cost. We counted the number of independent unknown terms appeared in the attack equation (3.4) and (3.5) by computer simulation. Although the details are omitted, we found 13,269 independent unknown terms. Since we estimated an s = 48 bit unknown with the brute force search, we set α = 64 (248−64 = 2−16 << 1). And since we derive the attack equation (3.4) focusing on 7-bit output sub-block Z3L7 , b is equal to 7. Consequently, we estimate that 28 ×0004(13269 + 64)/70005 0006 218.9 chosen plaintexts and 28+48 × 0004(13269 + 64)/70005 × 13269 0006 280.6 computational cost are needed to determine a 128-bit secret key. In this section, we attacked 6 round MISTY1 with FL functions, and succeeded in attacking because we could make the complexity for solving the attack equation relatively small (280.6 ). There are two reasons that we could make it. First, the key schedule for MISTY1 makes only 16 combinations of sub-keys. Second, the relation between 16 combinations of secret key sub-blocks is relatively simple because the key schedule is simple. MISTY1 has high performance because of its simple key schedule whereas our result suggests its key schedule might be a weakness in its security.
4 4.1
Low Order of S-Box Basic Idea
In this section, we analyze MISTY1 with no FL functions (no FL version). We omit the key schedule in the following, because it is not defined for no FL version. The structure of no FL version is shown in Figure 6. We uses equivalent functions and sub-keys to simplify the attack algorithm. Our goal is to know the maximum attackable number of rounds for the no FL version using property 1.
Security Analysis of MISTY1
P
Equivalent FO
?
b b001bK K1 FO1 - b 2
?
Equivalent FI
? b001bKij1
h (( (h (h h -FO2 - b h (( (h (h h -FO3 - b h h( h( (( h -FO4 - b
FIi1
S9
FIi2
S7
· ·
FIi3
S9
· ·
h (( (h (h h ? C
b001b hh (( ((h h b001b h (( (h (h h
b001b h (( (h (h h ?
223
b001b hh (( ((h h b001bKij2 b001b h (( (h (h h b001bKij3 b001b h (( (h (h h ?
Fig. 6. MISTY1 with no FL functions
We consider an attack using the chosen plaintext P which has the following form : P = (V PR ), V = (0, 0, 0, 0, 0, 0, 0, X8)
(4.1)
where “0” denotes a fixed plaintext sub-block. In this case, since the variable sub-block is inputted to the first round, Δ(7) Z2L7 =0x6d holds from the equation (2.3). However, if the attacker can select the value of PR which satisfies Z1 ⊕ PR = constant,
(4.2)
he can get Δ(7) Z4L7 =0x6d. Let us discuss the details of FO1 . The structure of FO1 is shown in Figure 7. Since the output of FI11 and the outputs of the first S9 in FI12 and FI13 are constants, the value of the output of FO1 is determined by the value of equivalent sub-keys K122 , K123 , K132 and K133 (total: 32 bits). Let us consider the technique to adjust the value of PR using the equation PR = FO(V ; K)
(4.3)
where K denotes a 32 bit variable corresponding to equivalent sub-keys K122 , K123 , K132 and K133 . The other equivalent sub-keys in FO are fixed to some constant value (for example, all zeros). As a result, equation (4.2) holds with probability 2−32 . If the attacker has all the value of K, he can get Δ(7) Z4L7 =0x6d. 4.2
Attack Equation
From the discussion above, Δ(7) Z4L7 =0x6d holds with probability 2−32 . Thus we can derive the following attack equation by calculating the seventh order differential from the ciphertext side.
224
H. Tanaka et al.
? V
b001bK
? V
1j1
FI11
S9
b001b h (( (h (h h
FI12
b001b h (( (h (h h
FI13
b001b h (( (h (h h ?
-
b001b h (( (h (h h b001bK1j2
S7
b001b h (( (h (h h b001bK1j3
S9
b001b h (( (h (h h ?
j = 2or3
Fig. 7. Structure of FO1
Δ(7) {FO6 (CL ; K611 , K612 , K621 , K622 , ) + CR }L7 = 0x6d
(4.4)
We solve the equation using the algebraic method. Although the details are omitted, we found that the number of independent unknown terms appeared in the attack equation (4.4) is 74 by computer simulation. We use a brute force search for K. If K satisfies equation (4.2), the attack equation (4.4) holds and we can determine the unknown terms. But if it does not satisfy equation (4.2), the attack equation does not hold. In the same way as in Section 3.3, we set α = 48 (232−48 = 2−16 << 1) to solve the equation. And since we derive the attack equation (4.4) focusing on 7-bit output sub-block Z4L7 , b is equal to 7. As a result, we need 27 × 0004(74 + 48)/70005 0006 211.2 chosen plaintexts and 27+32 × 0004(74 + 48)/70005 × 74 0006 249.4 computational cost. This computational cost is much smaller than 2128 needed for brute force search for secret key. Therefore a 2-round elimination method using a brute force search for the sub-key in the seventh round (total: 75 bits) works. 4.3
Complexity
Since we determine a 107-bit unknown (a 32-bit K and 75-bit sub-key in the seventh round), we set α = 123 (2107−123 = 2−16 << 1). Thus we need 27 × 0004(74 + 123)/70005 0006 211.9 chosen plaintexts and 27+32+75 × 000474 + 123/70005 × 74 0006 2125.1 computational cost to attack a seven round MISTY1 with no FL functions. In this section, we attacked 7 round MISTY1 with no FL functions, and succeeded in attacking because our technique enable us to raise the number of rounds. The computational cost can be divided into three tasks: 2-round elimination, adjusting the value of PR , and matrix computation. Since S-boxes has low order, the order of the attack equation turns out to be low. Then the number of the
Security Analysis of MISTY1
225
independent unknown terms is small. This makes the cost for the matrix computation small. Then we can use our computation resource to the other tasks, that is, computation for the 2-round elimination and adjusting the value of PR . Our technique enables us to determine the output of FO1 for the input V using only 32-bit sub-key satisfying (4.2). Note that this is a big improvement because 75-bit K satisfying (4.2) is necessary to determine the output of FO1 unless a certain trick is employed. Because the cost for matrix computation is kept small, we have more computation cost available and then the reduction from 75-bit to 32-bit helps to raise the number of rounds, that is, we could attack seven round MISTY1 with no FL. Table 2. Comparison of results #rounds #chosen plaintexts comput. cost method 34 48 MISTY1 5 2 2 integral [2] with FL 6 218.9 280.9 Section 3 54 61 MISTY1 6 2 2 impossible differential [4,5] with no FL 7 239 2125.1 Section 4
5
Conclusions
First, we showed the attack of a six round MISTY1 with FL functions. This attack algorithm takes advantage of the simplicity of the key schedule. We gave the condition for the secret key to satisfy Δ(8) Z3L7 = 0. And we also showed the technique to reduce the complexity to solve the attack equation using the simple relation among sub-keys. Consequently, we showed the attack equation for a six round MISTY1 with FL functions and the effective algorithm to determine a 128-bit secret key. Our algorithm needs 218.9 chosen plaintexts and 280.6 computational cost. Since our target, reduced round MISTY1, has FL functions in the last round, our algorithm is more realistic and powerful than the existing methods. Although it may be unfair to compare only the number of rounds and the number of chosen plaintexts, our algorithm can attack more rounds using less number of chosen plaintexts than Knudsen and Wagner’s integral cryptanalysis. We also succeeded in attacking MISTY1 with no FL functions that has more rounds than the model attacked previously. We presented a novel technique to control the value of fixed part of plaintext. Since the order of S-boxes is very small and unknown terms are very few, the algebraic method works very efficiently. The 2-round elimination method works because the complexity needed by the algebraic method in the procedure of solving the attack equation is very small. Our algorithm needs 211.9 chosen plaintexts and 2125.1 computational cost, and is superior to the previous ones, in the sense that it can attack more rounds with fewer chosen plaintexts. We summarize our results and the related works in table 2.
226
H. Tanaka et al.
Acknowledgements We would like to thank Ms. Makiko Shirota for her contribution to the computer simulations.
References 1. Babbage, S., Frisch, L.: On MISTY1 Higher Order Differential Cryptanalysis. In: Won, D. (ed.) ICISC 2000. LNCS, vol. 2015, pp. 22–36. Springer, Heidelberg (2001) 2. Knudsenand, L.R., Wagner, D.: Integral Cryptanalysis. In: Daemen, J., Rijmen, V. (eds.) FSE 2002. LNCS, vol. 2365, pp. 114–129. Springer, Heidelberg (2002) 3. Jakobsen, T., Knudsen, L.R.: The Interpolation Attack on Block Cipher. In: Preneel, B. (ed.) Fast Software Encryption. LNCS, vol. 1008, pp. 28–40. Springer, Heidelberg (1995) 4. K¨ uhn, U.: Cryptanalysis of Reduced-Round MISTY. In: Pfitzmann, B. (ed.) EUROCRYPT 2001. LNCS, vol. 2045, pp. 325–339. Springer, Heidelberg (2001) 5. K¨ uhn, U.: Improved Cryptanalysis of MISTY1. In: Daemen, J., Rijmen, V. (eds.) FSE 2002. LNCS, vol. 2365, pp. 61–75. Springer, Heidelberg (2002) 6. Matsui, M.: New block encryption algorithm MISTY. In: Biham, E. (ed.) FSE 1997. LNCS, vol. 1267, pp. 54–68. Springer, Heidelberg (1997) 7. Moriai, S., Shimoyama, T., Kaneko, T.: Higher Order Differential Attack of a CAST Cipher. In: Vaudenay, S. (ed.) FSE 1998. LNCS, vol. 1372, pp. 17–31. Springer, Heidelberg (1998) 8. NESSIE, https://www.cosic.esat.kuleuven.ac.be/nessie/ 9. RFC2994, http://www.faqs.org/rfcs/rfc2994.html 10. 3GPP, http://www.3gpp.org/
A Generic Method for Secure SBox Implementation Emmanuel Prouff2 and Matthieu Rivain1,2 1
University of Luxembourg Faculty of Sciences, Technology and Communication 6, rue Richard Coudenhove-Kalergi L-1359 Luxembourg 2 Oberthur Card Systems, 71-73 rue des Hautes Pˆ atures, 92726 Nanterre Cedex, France {m.rivain,e.prouff}@oberthurcs.com
Abstract. Cryptographic algorithms embedded in low resource devices are vulnerable to side channel attacks. Since their introduction in 1996, the effectiveness of these attacks has been highly improved and many countermeasures have been invalidated. It was especially true for countermeasures whose security was based on heuristics and experiments. Consequently, there is not only a need for designing new and various countermeasures, but it is also necessary to prove the security of the new proposals in formal models. In this paper we provide a simple method for securing the software implementation of functions called SBoxes that are widely used in symmetric cryptosystems. The main advantage of the proposed solution is that it does not require any RAM allocation. We analyze its efficiency and we compare it with other well-known countermeasures. Moreover, we use a recently introduced proof-of-security framework to demonstrate the resistance of our countermeasure from the viewpoint of Differential Power Analysis. Finally, we apply our method to protect the AES implementation and we show that the performances are suitable for practical implementations.
1
Introduction and Motivations
Side Channel Analysis are powerful attacks that utilize side channel leakage of embedded devices such as timing execution or power consumption to obtain information about secret data. It essentially allows two kinds of Power Attacks: the Simple Power Analysis (SPA) and the Differential Power Analysis (DPA). SPA consists in directly interpreting power consumption measurements and in identifying the execution sequence. In a DPA, the attacker focuses on the power consumption of a single instruction and performs statistical tests to reveal some correlations between the distribution of the measurement values and the sensitive data (i.e. depending on a secret value) manipulated by the instruction. Since the publication of the first DPA, many papers describing either countermeasures or attack improvements have been published (see [1,3,4,12] for example). Among S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 227–244, 2007. c Springer-Verlag Berlin Heidelberg 2007 0002
228
E. Prouff and M. Rivain
these improvements, Higher order DPA attacks (HODPA) are of particular interest. They extend the DPA by considering a set of several instructions instead of a single one. The number d of instructions targeted by the attack is called order of the DPA. The most common way of thwarting DPA involves random values (called masks) to de-correlate the leakage signal from the sensitive data which are manipulated [4, 6, 1]. If every sensitive variable is protected with a single mask, the implementation may thwart first order DPA but it can be theoretically attacked by a second order DPA targeting both the mask and the masked value. To thwart second order DPA (and more generally d-th order DPA), every sensitive variable must be masked with 2 (resp. d) random values. This implies that the timing-memory cost of the existing masking countermeasures increases greatly with the order of the DPA they aim to counteract. For applications where time constraints are very strong (such as contact-less applications), it may be considered as sufficient to mask all the sensitive data with a small number of random values generated at the beginning of the algorithm execution. The performances of many DPA countermeasures have been studied in this model. However recent results (see [13]) showed that second order DPA represents a real practical threat, especially when the same value is used to mask several intermediate results throughout the algorithm. Therefore, as noticed in [5, 15], the masks must be re-generated as often as possible and the analysis of the efficiency of a countermeasure has to take this fact into account. In this model, it appears that only a few among the existing countermeasures are still efficient in a low resource context and there is therefore still a need for investigating new and various countermeasures that thwart first order DPA efficiently when the masks change frequently. Due to the very large variety of Side Channel Attacks reported against cryptosystems and devices, sensitive applications (e.g. Banking, GSM or Identity Card) cannot make use of countermeasures with ad hoc security but need countermeasures which are provably secure against a precisely modeled adversary. Recently, new notions and tools have been introduced to evaluate the security of an implementation against Side Channel Attacks [20, 18, 17]. They allow for the formal validation or the formal invalidation of the resistance of a countermeasure under some realistic assumptions on the behavior of the device and on the power of the adversary. In this paper, we focus on DPA against block cipher algorithms. The most critical part when securing implementations of such algorithms against DPA is to protect their non-linear operations (i.e. the calls to the SBoxes). In the next section, we recall the methods which have been proposed in the Literature. Then, we introduce in Sect. 3 a new and simple countermeasure which counteracts first order DPA against SBox implementations whatever the algebraic structure of the SBox. When the masks change frequently, we argue that the new method has a good efficiency compared to the other generic methods. In Sect. 4, we analyze the security of our proposal by following the methodology described in [20]. In Appendix A, we apply our method to protect the implementation of
A Generic Method for Secure SBox Implementation
229
the AES SBox and we compare its efficiency and its security with the ones of other existing countermeasures.
2 2.1
Secure Implementation of Non-linear Functions in the Literature State of the Art of the Generic Methods
To counteract DPA attacks, one usually tries to make the power consumption signal as independent as possible of the sensitive data manipulated by the algorithm. Goubin and Patarin proposed in [6] a general solution, called duplication method, to protect the implementation of an algorithm. In this approach, every sensitive variable x is split into d blocks r1 , .., rd and every cryptographic primitive F that manipulates x is associated to d functions F1 , .., Fd and to a simple transformation σ (e.g. a simple bitwise addition) such that F (x) = σ(F1 [r1 , .., rd ], · · · , Fd [r1 , .., rd ]). Goubin and Patarin showed that an implementation protected by duplication method thwarts first order DPA if for every x (resp. every F [x]) the d blocks r1 , .., rd (resp. F1 [r1 , .., rd ], .. , Fd [r1 , .., rd ]) are never manipulated at the same time. Another approach consists in masking all the sensitive internal data with random values. Depending on the kind of operations performed by the linear parts of the algorithm, the mask values are introduced by modular addition, bitwise addition or multiplication. After selecting the masking operation 0003, the implementation of a cryptographic function F is rendered resistant to first order DPA by solving the following problem: Problem 1. Knowing x 0003 r, r and s, compute F (x) 0003 s such that every value of the power consumption signal is independent of x. If the function F is linear for the law 0003, then solving the above problem is a simple task. Indeed, since F (x0003r) equals F (x)0003F (r), we have F (x)0003s = F (x0003r)0003F (r)0003s. If F is non-linear for the law 0003 (which is the case when F is a SBox), then designing an implementation solving Problem 1 is much more difficult. Several kinds of methods have been proposed in the Literature and we recall two of them hereafter. The first one, called re-computation method [1, 11], involves the computation of a table corresponding to the masked SBox and the generation of one or several random value(s). In its most elementary version, two random values r and s are generated and the table T 0002 representing the function F 0002 : x 0002→ F (x 0003 r) 0003 s is computed from F and stored in the RAM of the device. Then, each time the masked value F [x] 0003 s has to be computed from the masked input x 0003 r, the table T 0002 is accessed. For such a method, the number of tables to be recomputed during the execution of the algorithm equals the number of different input/output masks which is allowed. Remark 1. The re-computation method is a particular case of the duplication method where d equals 2, where the sensitive value x is split into r1 = x 0003 r
230
E. Prouff and M. Rivain
and r2 = r and where the functions F1 and F2 equal r1 0002→ S[r1 ] 0003 s and r2 0002→ s respectively. The second kind of methods, that we call here SBox secure calculation, has been essentially applied to protect AES implementations [4, 7, 18, 19, 21] due to the strong algebraic structure of the AES SBox. The SBox outputs are not directly obtained by accessing a table but are computed on-the-fly by using a mathematical representation F of the SBox. Then, each time the masked value F [x] 0003 s must be computed from the pair (x 0003 r, r), an algorithm performing F and parameterized by the 3-tuple (x 0003 r, r, s) is executed. The computation of F is split into elementary operations (bitwise addition, bitwise multiplication, addition, multiplication, ..) and/or is performed in spaces of small dimensions (e.g. 4) by accessing one or several look-up table(s) (see [15]). The security of such a method is achieved by protecting each elementary operation and each memory transfer. When the same pair of input/output masks is used throughout the algorithm, the latter is said to be protected in the single-mask protection mode. In such a mode, a new pair of input/output masks (r, s) is generated at each execution of the algorithm and every computation F (x) performed during the execution is protected with this single pair. When the algorithm is protected in this mode, SBox secure calculation methods are often much more costly than the re-computation methods since they essentially replace a single access to a table by numerous logical operations and memory transfers. This difference between the performances of the two methods decreases when the number of different masks generated to protect the SBox calculations increases. In the multi-mask protection mode, the pair of masks (r, s) is re-generated each time a calculation F (x) must be protected (and thus many times per algorithm execution). In such a context, the SBox secure calculation methods become more appropriate and induce a smaller timing/memory overhead than the re-computation methods. Indeed, when re-computation methods are used in the multi-mask protection mode, a new table must be computed from F after each re-generation of masks (i.e. before every computation F (x)). As discussed in the previous paragraph, the choice between the first and the second category of methods highly depends on the protection mode, single-mask or multi-mask, in which the algorithm is implemented. We compare the two modes in the next section. 2.2
Single-Mask Protection Mode versus Multi-mask Protection Mode
When it is only required to thwart first order DPA, then implementing the algorithm in the single mask protection mode is sufficient. However recent results (e.g. [13]) show that second order DPA can represent a real practical threat when the amount of information leaking in the two consumption points targeted by the attacker is sufficiently high to make the effects of the de-synchronization and of the noise negligible. More generally, the analyses of second order DPA
A Generic Method for Secure SBox Implementation
231
published in [8,13,23] illustrate that the complexities of the various second order DPA are very different and show that some of them must be considered when implementing an algorithm for secure applications. In fact, as already put forward in [5,15], second order DPA are especially effective when the same pair of masks is used to protect all the inputs/outputs of the cryptographic primitives (e.g. all the inputs/outputs (x, F (x)) of the SBoxes) involved in the algorithm. Indeed, the beginning and the end dates of the execution of these primitives are quite easy to localize in the power consumption curves. Consequently, when all the inputs (resp. all the outputs) of the primitives are masked with the same value, then an attacker can precisely isolate two consumption points manipulating the same masks and is therefore able to unmask a sensitive data. A straightforward solution to make this particular class of second order DPA difficult to perform in practice consists in re-generating the masks as often as possible. Even if such a solution does not ensure that the algorithm thwarts all kinds of second order DPA, it allows to counteract those among the most efficient ones. For the reasons detailed above, we think that there is a practical interest to introduce an intermediate resistance level between the first order DPA-resistance, in which every first order DPA is counteracted, and the second order DPAresistance, in which every second order DPA is also counteracted. We call this intermediate level, first order DPA-resistance in the multi-mask protection mode. In this new level, the pair of masks (r, s) is re-generated each time a calculation F (x) must be protected (and thus many times per algorithm execution). Since the implementation of an algorithm perfectly thwarting second-order DPA requires a great timing and memory overhead and since an implementation thwarting only first order DPA does no longer provide enough security, we think that the first order DPA-resistance in the multi-mask protection mode nowadays offers the better security/efficiency trade-off. Analyzing the security of an implementation of an SBox for the new resistance level is equivalent to study the first order DPA resistance of the SBox implementation. The difference appears when it comes to investigate the efficiency of the countermeasure. For instance, a technic efficient in the single-mask mode (e.g. the re-computation method) can become much more costly in the multi-mask mode. In the next section, we present a simple SBox secure calculation method which resolves Problem 1 and we compare its performances with other generic methods in the multi-mask protection mode.
3 3.1
The New S-Box Secure Calculation Method Our Proposal
Let x denote a sensitive variable, let r and s be an input mask and an output mask and let F denote an SBox function mapping Fn2 into Fm 2 . The core idea of our proposal is to compute F ((x ⊕ r) ⊕ a) ⊕ s for every value a, storing the result in a register R0 if a equals r and in a second register R1 otherwise.
232
E. Prouff and M. Rivain
Let compare : x, y 0002→ compare(x, y) be the function returning 0 if x = y and 1 otherwise, we depict our proposal in the following algorithm. Algorithm 1. Computation of a masked S-Box output from a masked input Input: a masked value x ˜ = x ⊕ r, an input mask r, an output mask s, a look-up table for F Output: the masked S-Box output F (x) ⊕ s 1. for a = 0 to 2n − 1 do 2. cmp ← compare(a, r) x ⊕ a) ⊕ s 3. Rcmp ← F (˜ 4. return R0
Remark 2. Many microprocessors implement the function compare by a single instruction. Thus, we will assume in the rest of the paper that this function is an elementary operation. To verify the correctness of Algorithm 1., it can be checked that Step 3 performs the following operation: 0002 R0 ← F (˜ x ⊕ a) ⊕ s if a = r , (1) R1 ← F (˜ x ⊕ a) ⊕ s otherwise . x ⊕r)⊕s = F (x)⊕s when the loop is completed. Hence, R0 contains the value F (˜ The security of the new method highly depends on the assumption that the leakage generated by a register transfer is the same whatever is the register. This assumption is commonly accepted and it is the security core of SPA countermeasures used to protect asymmetric cryptosystems (see for instance [9]). As every variable manipulated during the execution of the algorithm is masked by a random value, it can be proven (in a similar way as done in Sect. 4) that it thwarts DPA in both Hamming Weight and Hamming Distance models. Nevertheless, Algorithm 1. has a potential security weakness because it involves dummy operations (i.e. operations which do not impact the value returned by the algorithm)3 . This flaw could be exploited by an attacker who would disrupt the Step 3 operation (for instance by fault injection [2]) during a chosen loop iteration a = a0 . By checking if Algorithm 1. output is erroneous or not, the attacker would be able to detect when the mask r equals the known loop index a0 . Finally, for the power consumption measurements corresponding to the cases r = a0 , the attacker would be able to unmask x˜ and to perform a classical first order DPA. To circumvent the flaw, dummy operations must be avoided. When F is balanced 4 , i.e. when #F −1 (y) equals 2n−m for every y ∈ Fm 2 , then we propose in 3 4
Attacks exploiting dummy operations have been mainly applied to attack SPAresistant implementations of RSA (see [24] for an example of such attacks). For security reasons, functions F involved in cryptographic applications are always balanced.
A Generic Method for Secure SBox Implementation
233
Algorithm 2. Computation of a masked S-Box output from a masked input Input: a masked value x ˜ = x ⊕ r, an input mask r, an output mask s, a look-up table for F Output: the masked S-Box output F (x) ⊕ s 1. 2. 3. 4. 5. 6. 7.
R0 ← s R1 ← s for a = 0 to 2n − 1 do cmp ← compare(a, r) x ⊕ a) Rcmp ← Rcmp ⊕ F (˜ cmp ← compare(R0 , R1 ) return R0 ⊕ (cmp × R1 )
the following a slightly modified version of Algorithm 1. where both registers R0 and R1 are involved in the computation of the output value. According to (1), register R0 contains the value F (x) ⊕ s at the 0003 end of the x⊕ loop. Moreover it can be checked that the content of R1 equals s ⊕ a∈Fn2 F (˜ a0003=r 0003 a). As F is balanced, the summation F (˜ x ⊕ a) equals F (˜ x ⊕ r) that a∈Fn 2 a0003=r 0003 is F (x) since we have x ˜ = x ⊕ r. Indeed, the summation F (x) can be a∈Fn 2 0003 0003 rewritten y∈Fm ( a∈Fn ; F (a)=y y). As F is assumed to be balanced, each term 2 2 0003 y corresponds to 2n−m times the sum of the vector y with itself. a∈Fn 2 ; F (a)=y 0003 Thus, each term a∈Fn ; F (a)=y y equals the null vector if n > m and equals y if 2 0003 0003 n = m. This implies that a∈Fn F (a) equals 0 if n > m and equals y∈Fm y if 2 2 n = m. Since 0003 the sum of all the elements of a space equals the zero vector, one deduces that a∈Fn F (a) is also equal to 0 if n and m are equal. Consequently, 2 x ⊕ r) = F (x) ⊕ s holds at the end when F is balanced, the equality R1 = s ⊕ F (˜ of the loop. Finally, Step 6 aims at verifying that the contents of the two registers R0 and R1 are equal. Then, Step 7 returns either the expected result if no perturbation occurred or an erroneous result otherwise. This simple improvement of Algorithm 1. ensures that the attacker is no longer able to determine when the index a0 of the targeted loop iteration equals the input mask r. Algorithm 1. requires 2n × 3 logical operations (2 x-or operations and 1 comparison per loop iteration) and 2n memory transfers (1 table look-up per loop operation). For Algorithm 2., two assignments (Steps 1 and 2), one comparison (Step 6), one multiplication and one x-or operation (Step 7) are added. Since changing the input and output masks at each execution of Algorithm 2. has no impact on its performances, our proposal is efficient in the multi-mask protection mode. Moreover, it is generic in the sense that it can be applied to any balanced SBox F without any assumption on the algebraic structure of F . In what follows, we compare the performances of our proposal with the ones of other generic methods.
234
3.2
E. Prouff and M. Rivain
Comparison with Other Generic Methods
We focus here on two well-known elementary masked SBox computation methods. The first one is the table re-computation method recalled in Sect. 2. The second one, that we call here the global look-up table method, uses a large look-up table addressed with the mask and the masked value. Re-computation method. Let T denote a 2n bytes table allocated in RAM5 . When the block cipher algorithm is protected in the multi-mask protection mode, a new pair of input/output masks is generated each time an SBox output is computed and the following sequence of operations is performed: 1. for x = 0 to 2n − 1 do 2. T [x] ← F (x ⊕ r) ⊕ s 3. return T [˜ x] This algorithm requires 2n × 2 logical operations (2 x-or operations per loop iteration) and 2n ×2+1 memory transfers (1 read operation and 1 write operation per loop iteration and one access to the re-computed table). The computational cost of the table re-computation method is approximatively the same as for Algorithm 2. However, it also requires the allocation of 2n bytes of RAM which can be problematic in a low resource context, especially when several SBoxes need to be protected. Global look-up table method. Let T 0002 denote the look-up table associated to the function (x, y) 0002→ F (x ⊕ y) ⊕ y. To compute F (x) ⊕ r from x ⊕ r and r, the global look-up table method performs a single operation: the table look-up x, r]. Its timing performances are ideal since it requires only one memory T 0002 [˜ transfer. However, the size 22n of the look-up table T 0002 makes an application of the method difficult in a low resource context. For instance, if n is greater than or equal to 7, the amount of ROM required is definitively too great (at least 16 KB!). When n is lower than or equal to 6, the feasibility of the method depends on the amount of ROM of the device and on the number of different SBoxes which must be protected. The method can become interesting when it comes to protect SBoxes mapping F42 into itself (as it is the case for FOX [10] where three such SBoxes are involved) or when the SBox calculus can be performed in spaces of dimensions lower than or equal to 4 (as it is the case for the AES SBox - see Appendix A -). From a security point of view, the global look-up table method has a flaw since it manipulates the mask r and the masked data x˜ at the same time. Indeed x˜ and r are concatenated to address the look-up table T 0002 and thus, the value x˜ r is transferred through the bus. Since the variables x˜ r and x are statistically dependent, the leakage on x ˜ r is potentially exploitable by a first order DPA. 5
To make the description easier, we assume that every element of Fm 2 is stored on one byte.
A Generic Method for Secure SBox Implementation
235
Table 1. Comparison of methods solving Problem 1 for the bitwise addition Method Table recomp. Table recomp. Global LUT Algo. 2.
Masking Mode Pre-computation SBox calculation RAM ROM Single-masking 2n+1 MT + 2n+1 RLO 1MT 2n 2n n+1 n+1 Multi-masking 0 (2 + 1)MT + 2 RLO 2n 2n Multi-masking 0 1MT 0 22n Multi-masking 0 2n MT + (3 × 2n + 5)RLO 0 2n
Table 1 summarizes the costs of the three previously considered methods according to the number of register logical operations (RLO), the number of memory transfers (MT), the memory size (bytes in RAM) and the code size (bytes in ROM).
4
Security Analysis
To study the security of our proposal we will use some basic notions of information Theory. We recall them in the next section. 4.1
Preliminaries
We use the calligraphic letters, like X , to denote finite sets. The corresponding large letter X is then used to denote a random variable over X , while the lowercase letter x - a particular element from X . The probability of the event (X = x) is denoted P[X = x]. The entropy H(X) of a random variable X aims at measuring the amount of information provided by an observation of X and 0004 entropy of satisfies H(X) = − x∈X P[X = x] log(P[X 0004 = x]). The conditional 0004 X given Y , denoted by H(X Y ), equals − y∈Y P[Y = y] x∈X P [X = x Y = y] log(P [X = x Y = y]). To quantify the amount of information that Y reveals about X, the notion of mutual information is usually involved. The mutual information of X and Y is the value I(X, Y ) defined by I(X, Y ) = H(X) − H(X Y ). The random variables X and Y are independent if and only if I(X, Y ) equals 0. Moreover, the mutual information is always positive or null and it satisfies the following property. Property 1. Let X and Y be two random variables respectively defined over X and Y. For every function f defined over Y, we have I(X, f (Y )) ≤ I(X, Y ). For our security analysis, we shall also use the following proposition. Proposition 1. Let X and Y be two random variables defined over X and let Z be a random variable defined over Z. If Z is mutually independent of X and Y and has a uniform distribution over Z, then for every measurable function f defined from X 2 into Z, we have I(X, Z ⊕ f (X, Y )) = 0. As a consequence of Proposition 1, we have I(X, Z ⊕ X) = 0 when X and Z satisfy the conditions of Proposition 1.
236
4.2
E. Prouff and M. Rivain
Evaluation Methodology
To evaluate the security of our proposal, we follow the outlines of the methodology depicted in [20]. This methodology holds in five steps: specify the target implementation, specify the target secret, define the adversary model, evaluate the information leakage and define a metric to evaluate the security. The target implementation is Algorithm 2. running on a smart card. The target secret is the un-masked value x corresponding to the masked input x ˜ of Algorithm 2. The adversary model. We assume that the attacker can query the targeted cryptographic primitive with an arbitrary number of plaintexts and obtain the corresponding physical observations, but cannot choose its queries in function of the previously obtained observations (such a model is called non-adaptive known plaintext model in [20]). We also assume that the attacker has access to the power consumption and electromagnetic emanations of the device and applies a first order DPA attack but is not able to perform HODPA. The effectiveness of the prediction made by the adversary is strongly related to the amount of information provided by the physical observations. In our analysis, we assume that the attacker knows how information leaks from the device and straightforwardly makes its prediction based on the leakage model (such a prediction is called device profiled prediction in [20]). Moreover, the physical observations are assumed to be perfect i.e. matching exactly the leakage model (which is a very favorable situation from the attacker’s viewpoint). For our analysis, we choose to consider a general leakage model that can be used for both power consumption or electromagnetic emanations. The leakage model. Different models coexist to quantify the leakage of CMOS circuits with respect to the data handled but two of them are predominantly used: the Hamming Weight model (where the leakage is related to the Hamming Weight of the data handled) and the Hamming Distance model (where the leakage is related to the Hamming Distance between the previous and the current data handled in a register or transmitted through a bus - see [3]). In the Hamming Distance model, the leakage of the bit-transitions 0 → 1 and 1 → 0 are assumed to be equal, which makes the leakage analysis much simpler. However this assumption, which is adequate when trying to attack an unprotected implementation, is not relevant to model a strong opponent against a secure implementation. Indeed, in practice CMOS gates leak differently when charging or discharging the load capacitance (especially in the case of electromagnetic emanations [17]). Hence, as mentioned in [17,20], a more accurate leakage model must be defined to model an attacker who is able to observe these differences.
A Generic Method for Secure SBox Implementation
237
Definition 1. [17] In the Hamming Distance Extended (HDE) model, the leakage LHDE (s, y) related to a variable y that replaces an initial state s, satisfies LHDE (s, y) = N0→1 (s, y) × P0→1 + N1→0 (s, y) × P1→0 + β ,
(2)
where N0→1 (s, y) (resp. N1→0 (s, y)) denotes the number of transitions 0 → 1 (resp. 1 → 0) from s to y, P0→1 (resp.P1→0 ) denotes the average energy consumed by a transition 0 → 1 (resp. 1 → 0) and where β denotes some noise. −P1→0 Denoting by δ the normalized difference P0→1 and by ε the leakage P0→1 , P0→1 we have P1→0 = ε(1 − δ) and Relation (2) can be rewritten:
0005 0006 δ δ LHDE (s, y) = ε 1 − HW(s ⊕ y) + ε (HW(y) − HW(s)) + β . 2 2
(3)
From Relation (3), it can be verified that the HDE model includes the Hamming Weight (HW) and the Hamming Distance (HD) models. Indeed, when there is no difference between the transitions 0 → 1 and 1 → 0 (i.e. when δ = 0), we have LHDE (s, y) = εHW (s ⊕ y) + β and the HDE model is equivalent to the HD model. On the other hand, when the initial state s is constant, equal to 0, then we have LHDE (s, y) = εHW (y) + β and the HDE model is equivalent to the HW model. The HDE model is also appropriate to quantify electro-magnetic emanations leaking in the signed distance model which assumes that P1→0 equals −P0→1 [17]. In this case, we have δ = 2 and the leakage satisfies LHDE (s, y) = ε(HW (y) − HW (s)) + β. Once the behavior of the device and the attacker capacities are modeled, a method can be deduced from Standaert et al. [20] to prove the security of a countermeasure. Evaluation of the security. Let us denote by X, Y and IS the random variables respectively corresponding to the sensitive data targeted by the attacker, the data manipulated at the date of a leakage and the initial state replaced by Y . In the theoretical model depicted by the four steps above, it has been proved in [18] and [20] that a first order DPA does not succeed in extracting information about X if and only if X and LHDE (IS, Y ) are independent for every pair (IS, Y ) appearing during the execution of the algorithm. In the following section, we evaluate I(X, LHDE (IS, Y )) for every pair (IS, Y ) appearing during the execution of Algorithm 2. 4.3
Proof of Security
If IS is an operation code, a constant memory address or a system variable which is independent of the intermediate results of the algorithm, then Property 1 and the positivity of the mutual information imply the following inequality: 0 ≤ I(X, LHDE (IS, Y )) ≤ I(X, Y ) .
(4)
In such a case, proving I(X, Y ) = 0 is sufficient to prove I(X, LHDE (IS, Y )) = 0.
238
E. Prouff and M. Rivain
If IS corresponds to an intermediate result of the algorithm, then Property 1 and the positivity of the mutual information imply the following inequality: 0 ≤ I(X, LHDE (IS, Y )) ≤ I(X, (IS, Y )) ,
(5)
where (IS, Y ) denotes the random variable that has the joint distribution of IS and Y . In this case, proving that I(X, (IS, Y )) equals 0 is sufficient to prove that I(X, LHDE (IS, Y )) equals 0. To show that I(X, LHDE (IS, Y )) equals 0 for every pair (IS, Y ) appearing during the execution of Algorithm 2., we decompose our security proof into two steps, depending on the nature of IS. In a first step, we show that for every intermediate variable Y manipulated by Algorithm 2., the mutual information I(X, Y ) equals 0. According to Inequality (4), this will prove that X and LHDE (IS, Y ) are independent when IS is assumed to be an operation code, a constant memory address or a system variable: we shall say in this case that there is no variables leakage. In a second time, we show that for every transition occurring between two intermediate results Y1 and Y2 , the mutual information I(X, (Y1 , Y2 )) equals 0. According to Inequality (5), this will prove that X and LHDE (IS, Y ) are independent when IS corresponds to an intermediate result of the algorithm: we shall say in this case that there is no transitions leakage. Variables leakage. We decompose Algorithm 2. into several elementary operations each manipulating an intermediate result computed from the sensitive variable X, the input mask R and the output mask S. Since the random variables R and S correspond to randomly generated values, we can assume that they have a uniform distribution and that 0003a X, R and S are mutually independent. j=0 F (X ⊕ R ⊕ j) and let tmp denote Let suma (X, R) denotes the sum j0003=R the register used to store the intermediate results at Step 5. Table 2 lists the intermediate values that occur during an execution of Algorithm 2. As R and S have a uniform distribution and as X, R and S are mutually independent, one straightforwardly deduces from Proposition 1 that all the intermediate results listed in Table 2 are independent of X. Table 2. Intermediate results manipulated during Algorithm 2 Step 5.
Instruction
Intermediate results
tmp ← x ˜
X⊕R
tmp ← tmp ⊕ a
X⊕R⊕a
tmp ← F (tmp)
F (X0007⊕ R ⊕ a)
Rcmp ← Rcmp ⊕ tmp
S⊕ S⊕
0
if R = a
0007 suma−1 (X, R) otherwise F (X) if R = a suma (X, R)
6.
cmp ← compare(R0 , R1 ) F (X) ⊕ S
7.
return R0 ⊕ (cmp × R1 ) F (X) ⊕ S
otherwise
A Generic Method for Secure SBox Implementation
239
Table 3. Transitions between intermediate results occurring during Algorithm 2 Step
Operation
Target
Initial State IS
5
tmp ← x ˜
tmp
F (X ⊕ R ⊕ (a − 1))
X ⊕R
5
tmp ← tmp ⊕ a
tmp
X ⊕R
X ⊕R⊕a
5
tmp ← F (tmp)
A-BUS
adF + (X ⊕ R ⊕ (a − 1))
adF + (X ⊕ R ⊕ a)
5
tmp ← F (tmp)
D-BUS
F (X ⊕ R ⊕ (a − 1))
F (X ⊕ R ⊕ a)
5
tmp ← F (tmp)
S-BUS
F (X ⊕ R ⊕ (a − 1))
adF + (X ⊕ R ⊕ a)
5
tmp ← F (tmp)
S-BUS
5
tmp ← F (tmp)
tmp
5
Rcmp ← Rcmp ⊕ tmp
Rcmp
R1 ← R1 ⊕ cmp
cmp
7
New State Y
adF + (X ⊕ R ⊕ a) ⎧ ⎨ 0 S⊕ ⎩ sum
F (X ⊕ R ⊕ a)
X ⊕R⊕a if R = a a−1 (X, R) F (X ⊕ S)
otherwise
S⊕
⎧ ⎨ ⎩
F (X ⊕ R ⊕ a) F (X)
if R = a
suma (X, R)
otherwise
F (X ⊕ S)
Transitions leakage. We consider hereafter the transitions between intermediate results that occur either on the bus or in the registers R0 , R1 , cmp and tmp. For the bus, we consider transitions that appear when memory addresses and data transit either on the same bus (here denoted S-BUS for single bus) or on different bus (an address bus denoted A-BUS and a data bus denoted D-BUS). Let adF denote the memory address of the look-up table F . In Table 3, we list the successive bus or register transitions occurring during an execution of Algorithm 2. We only list the transitions involving the sensitive data X. For all except the fifth row of Table 3, Proposition 1 and Property 1 straightforwardly imply that I(X, (IS, Y )) equals 0. For the fifth row, which corresponds to the update of Rcmp , let us study (IS, Y ) for R = a and R = a. If R equals a, then (IS, Y ) can be written (S, S⊕F (X⊕R⊕a)) where S is uniformly distributed over Fm 2 and is independent of the pair (X, R). If R differs from a, then (IS, Y ) can be written (S⊕suma−1 (X, R), S⊕suma(X, R)) that is (S 0007 , S 0007 ⊕F (X ⊕R⊕a)) after denoting S ⊕ suma−1 (X, R) by S 0007 . It can be verified that S 0007 has a uniform 0007 distribution over Fm 2 . Thus, due to Proposition 1, S is independent of (X, R). One deduces that (IS, Y ) is equivalent to a random variable (U, U ⊕F (X ⊕R⊕a)) where U is uniformly distributed over Fm 2 and independent of the pair (X, R). Then, from Property 1, we get I(X, (IS, Y )) ≤ I(X, (U, X ⊕R)) and Proposition 1 implies that I(X, (IS, Y )) equals 0. We showed in this section that the sensitive variable X is independent of every intermediate result Y and every transition IS → Y that occurs during the execution of Algorithm 2. As argued in Sect. 4.2, this implies that there is no mutual information between the sensitive variable and the instantaneous power consumption leakages. We can therefore conclude that Algorithm 2. is secure against first order DPA in the HDE model.
5
Conclusion
In this paper we have presented a new masking scheme for software SBox implementations that requires no RAM allocation. Since our method does not rely on specific SBox properties, it is generic and can thus be applied to protect any symmetric cryptosystem. We have argued that a first order DPA countermeasure
240
E. Prouff and M. Rivain
must be efficient not only in the single-mask protection mode but also in the multi-mask mode. In this mode, we have shown that our countermeasure is as efficient as the other classical generic methods and does not require RAM allocation. We have evaluated our solution within the framework recently introduced by Standaert et al. in [17, 20], proving its security against first order DPA under realistic assumptions about the attacker and the device behaviors. Finally, we have applied our method to AES and we have compared its efficiency with other secure implementations. Based on this analysis, we think that the timing and memory overhead of our countermeasure are suitable for practical implementations of the AES algorithm when a protection in multi-mask protection mode is required.
Acknowledgements We would like to thank Christophe Giraud and Emmanuelle Dottax for their fruitful comments and suggestions on this paper.
References 1. Akkar, M.-L., Giraud, C.: An Implementation of DES and AES, Secure against Some Attacks. In: Ko¸c, C ¸ .K., Naccache, D., Paar, C. (eds.) CHES 2001. LNCS, vol. 2162, pp. 309–318. Springer, Heidelberg (2001) 2. Boneh, D., DeMillo, R., Lipton, R.: On the Importance of Eliminating Errors in Cryptographic Computations. Journal of Cryptology 14(2), 101–119 (2001) 3. Brier, E., Clavier, C., Olivier, F.: Correlation Power Analysis with a Leakage Model. In: Joye, M., Quisquater, J.-J. (eds.) CHES 2004. LNCS, vol. 3156, pp. 16–29. Springer, Heidelberg (2004) 4. Chari, S., Jutla, C., Rao, J., Rohatgi, P.: Towards Sound Approaches to Counteract Power-Analysis Attacks. In: Wiener, M.J. (ed.) CRYPTO 1999. LNCS, vol. 1666, pp. 398–412. Springer, Heidelberg (1999) 5. Goli´c, J., Tymen, C.: Multiplicative Masking and Power Analysis of AES. In: Kaliski Jr., B.S., Ko¸c, C ¸ .K., Paar, C. (eds.) CHES 2002. LNCS, vol. 2523, pp. 198–212. Springer, Heidelberg (2003) 6. Goubin, L., Patarin, J.: DES and Differential Power Analysis – The Duplication Method. In: Ko¸c, C ¸ .K., Paar, C. (eds.) CHES 1999. LNCS, vol. 1717, pp. 158–172. Springer, Heidelberg (1999) 7. Gueron, S., Parzanchevsky, O., Zuk, O.: Masked Inversion in GF(2n ) Using Mixed Field Representations and its Efficient Implementation for AES. In: Embedded Cryptographic Hardware: Methodologies and Architectures, pp. 213–228. Nova Science Publishers (2004) 8. Joye, M., Paillier, P., Schoenmakers, B.: On Second-Order Differential Power Analysis. In: Rao, J.R., Sunar, B. (eds.) CHES 2005. LNCS, vol. 3659, pp. 293–308. Springer, Heidelberg (2005) 9. Joye, M., Yen, S.-M.: The Montgomery Powering Ladder. In: Kaliski Jr., B.S., Ko¸c, C ¸ .K., Paar, C. (eds.) CHES 2002. LNCS, vol. 2523, pp. 291–302. Springer, Heidelberg (2003)
A Generic Method for Secure SBox Implementation
241
10. Junod, P., Vaudenay, S.: FOX: a new family of block ciphers. In: Handschuh, H., Hasan, M.A. (eds.) SAC 2004. LNCS, vol. 3357, pp. 114–129. Springer, Heidelberg (2004) 11. Messerges, T.: Securing the AES Finalists Against Power Analysis Attacks. In: Schneier, B. (ed.) FSE 2000. LNCS, vol. 1978, pp. 150–164. Springer, Heidelberg (2001) 12. Messerges, T.: Using Second-Order Power Analysis to Attack DPA Resistant software. In: Paar, C., Ko¸c, C ¸ .K. (eds.) CHES 2000. LNCS, vol. 1965, pp. 238–251. Springer, Heidelberg (2000) 13. Oswald, E., Mangard, S., Herbst, C., Tillich, S.: Practical Second-Order DPA Attacks for Masked Smart Card Implementations of Block Ciphers. In: Pointcheval, D. (ed.) CT-RSA 2006. LNCS, vol. 3860, Springer, Heidelberg (2006) 14. Oswald, E., Mangard, S., Pramstaller, N., Rijmen, V.: A Side-Channel Analysis Resistant Description of the AES S-box. In: Gilbert, H., Handschuh, H. (eds.) FSE 2005. LNCS, vol. 3557, pp. 413–423. Springer, Heidelberg (2005) 15. Oswald, E., Schramm, K.: An Efficient Masking Scheme for AES Software Implementations. In: Song, J., Kwon, T., Yung, M. (eds.) WISA 2005. LNCS, vol. 3786, pp. 292–305. Springer, Heidelberg (2006) 16. Oswald, E.: Stefan, and N. Pramstaller. Secure and Efficient Masking of AES – A Mission Impossible? Cryptology ePrint Archive, Report 2004/134 (2004) 17. Peeters, E., Standaert, F.-X., Quisquater, J.-J.: Power and Electromagnetic Analysis: Improved Model, Consequences and Comparisons. In Integration, the VLSI Journal. Elsevier, Spring (to appear) 18. Prouff, E., Giraud, C., Aumonier, S.: Provably Secure S-Box Implementation Based on Fourier Transform. In: Goubin, L., Matsui, M. (eds.) CHES 2006. LNCS, vol. 4249, pp. 216–230. Springer, Heidelberg (2006) 19. Rudra, A., Bubey, P.K., Jutla, C.S., Kumar, V., Rao, J., Rohatgi, P.: Efficient Rijndael Encryption Implementation with Composite Field Arithmetic. In: Ko¸c, C ¸ .K., Naccache, D., Paar, C. (eds.) CHES 2001. LNCS, vol. 2162, pp. 171–184. Springer, Heidelberg (2001) 20. Standaert, F.-X., Malkin, T.G., Yung, M.: Side-Channel Resistant Ciphers: Model, Analysis and Design. Cryptology ePrint Archive, Report 2006/139 (2006) 21. Trichina, E.: Combinatorial Logic Design for AES SubByte Transformation on Masked Data. Cryptology ePrint Archive, Report 2003/236 (2003) 22. Trichina, E., Korkishko, L.: Secure and Efficient AES Software Implementation for Smart Cards. In: Lim, C.H., Yung, M. (eds.) WISA 2004. LNCS, vol. 3325, pp. 425–439. Springer, Heidelberg (2005) 23. Waddle, J., Wagner, D.: Toward Efficient Second-order Power Analysis. In: Joye, M., Quisquater, J.-J. (eds.) CHES 2004. LNCS, vol. 3156, pp. 1–15. Springer, Heidelberg (2004) 24. Yen, S.-M., Joye, M.: Checking before output may not be enough against faultbased cryptanalysis. IEEE Transactions on Computers 49(9), 967–970 (2000)
A
Application to AES
We recall that the AES SBox is composed of two parts: a non-linear function and an affine mapping. In the following we focus on the non-linear part, which will be denoted here by F . Let p(x) denotes the irreducible polynomial x8 ⊕ x4 ⊕
242
E. Prouff and M. Rivain
x3 ⊕ x ⊕ 1 ∈ F2 [x]. The function F is defined in F2 [x]/p(x) by F (a) = 0 if a = 0 and by F (a) = a−1 otherwise. At first, we applied the method depicted in Algorithm 2. for n = 8 to protect the SBox access of the AES algorithm. We implemented the solution on a classical 8051 chip running at 8 Mhz and we studied the performances of the implementation. Clearly, the timing of the resulting AES algorithm was not interesting (around 115 ms and 1, 8 KB of ROM) compared to 5ms for an implementation without countermeasures. Secondly, we represented F28 has an extension of F24 , allowing us to perform the computations in F24 instead of F28 (such a method is usually called composite field approach). We chose the two irreducible polynomials p0007 (x) = x2 + x + {e} and p00070007 (x) = x4 + x + 1 in F4 [x] and F2 [x] respectively and we denoted by map the field isomorphism which takes an element a of F2 [x]/p(x) as input and outputs the pair (ah , al ) ∈ (F2 [x]/p00070007 (x))2 corresponding to the coefficients of the linear polynomial (ah x + al ) ∈ F24 [x]/p0007 (x). Moreover, we denoted by InvF24 the function which corresponds to the inverse function over F2 [x]/(x4 + x + 1){0} and which maps 0 into itself. In the following, we depict the different steps of our computation: Algorithm 3. Inversion of a masked element a = a ⊕ ma in F28 Input: ( a = a ⊕ ma , ma ) ∈ F28 2 −1 = a−1 ⊕ m0002 , m0002 ) Output: (a a a 1. Pick up three 4-bit random md , m0002h and m0002l 2. (mh , ml ) ∈ F224 ← map(ma ) a) 3. ( ah , a000bl ) ∈ F224 ← map( 2
h ⊗ a000bl ⊕ a000bl 2 ⊕ md ⊕ a
h ⊗ ml 4. d ← a
h ⊗ {e} ⊕ a ⊕ a000bl ⊗ mh ⊕ m2h ⊗ {e} ⊕ m2l ⊕ mh ⊗ ml −1 ← Algorithm 2.(d, md , md−1 , InvF ) 5. d 24
[( ah , a000bl ) = (ah ⊕ mh , al ⊕ ml )] [d = d ⊕ md ] −1 = d−1 ⊕ m −1 ] [d d 0002 0002
[a = ah ⊕ m0002h ]
−1 ⊕ m0002 ⊕ m ⊗ d −1 ⊕ m −1 ⊗ a
0002 ← a
h ⊗ d
h ⊕ md−1 ⊗ mh 6. a h h d h 0002 0002 0002 −1 ⊕ m ⊕ a −1 ⊗ m ⊕ a
7. a000b0002 ← a000bl ⊗ d ⊕ d ⊗ m −1 ⊕ mh ⊕ ml ⊗ m l l l l
8. m0002a ← map−1 (m0002h , m0002l )
0002 , a000b0002 ) −1 ← map−1 (a 9. a 10. return
h l −1 , m0002 ) (a a
h
d
h
d−1
[a000b0002l = a0002l ⊕ m0002l ]
−1 = a−1 ⊕ m0002 ] [a a
For this version, the timing of the resulting AES algorithm are very interesting and the input and output masks can be changed at each execution of the algorithm. In the following table, we have listed the timing/memory performances of our proposal and the ones of other methods proposed in the Literature. As the performances have been measured for a particular implementation on a particular architecture, the table above does not aim at arguing that a method is better than another but aims at enlightening the main particularities (timing performances and ROM/RAM requirements) of each method.
A Generic Method for Secure SBox Implementation
243
Table 4. Comparison of several methods to protect AES against DPA Method Timings (ms) RAM (bytes) ROM (bytes) Multi-masking Straightforward implementation 5 0 1150 Re-computation Methods in the single-mask mode Re-computation Method in F28 [11] ×1.42 +256 +49% not allowed Re-computation Method in F24 [11] ×2.60 +16 +150% not allowed Re-computation Methods in the multi-mask mode Re-computation Method in F28 [11] ×50, 60 +256 +49% allowed Re-computation Method in F24 [11] ×5.86 +16 +150% allowed Secure SBox computation methods based on the composite field approach Oswald et al. [14, 16] ×5.20 0 +173% allowed This paper (Algo. 3.) ×5.30 0 +150% allowed Prouff et al. [18] ×6.40 0 +147% allowed Methods with security under discussion Oswald and Schramm [15] ×2.40 0 +200% allowed Trichina et al. [22] ×4.20 +256 +165% not allowed
The AES implementations listed above only differ in their approaches to protect the SBox access. The linear steps of the AES have been implemented in the same way and the internal sensitive data have been masked by bitwise addition of a random value. We chose to protect only rounds 1 to 3 and 8 to 10, assuming that the diffusion properties of the AES algorithm make DPA attacks impossible to mount on inner rounds 4 to 7 (this implies that the SBox calculations made in rounds 4 to 7 are performed by simply accessing the table representation of F which is stored in ROM). In the single mask mode, the re-computation method in F28 has the best timing performances but at least 256 bytes of RAM must be allocated to store the re-computed SBox table. As RAM is a sensitive resource in the area of embedded devices, we implemented a second version which follows the outlines of the composite field approach and then applies the re-computation method in F24 . Because only 16 bytes of RAM are required to store the table re-computed from the InvF24 function, the new implementation requires much less RAM than the version in F28 and the timing performances are suitable for practical applications. As the multi-mask protection mode offers better security with respect to power analysis attacks [5, 15], we tested the re-computation method in this mode. As expected, the re-computation method in F28 no longer gives full satisfaction in this mode (requiring 253 ms for one AES execution). The timing performances of the re-computation method in F24 are acceptable in the multi-mask protection mode, however they are close to (and even slightly greater than) the performances of the SBox secure calculation methods. Moreover, 16 bytes of RAM are required. The SBox secure calculation methods of Oswald et al., Prouff et al. and our proposed approach only differ in the ways of securely computing the value ˜ md and md−1 (i.e. to securely perform the fifth Step of d−1 ⊕ md−1 from d, Algorithm3.): – In [14,16], the inversion is performed by going down to F4 and its complexity approximatively equals the one of Algorithm 3. excluding the 5th Step which
244
E. Prouff and M. Rivain
is replaced by a square operation (since the inversion operation in F4 is equivalent to squaring). For our implementation of [14, 16], the number of cycles required for the fifth step is 267. – In [18], the inversion is essentially performed by computing a Fourier transform on F42 . For our implementation of [18], the number of cycles required for the fifth step is 468. – For the new solution presented here, the fifth step essentially corresponds to the computation of y −1 ⊕ md−1 for every y ∈ F42 . For our implementation, the number of cycles required by the fifth step is 270. The three methods can be used in the multi-mask protection mode without decreasing the performances of the implementation and they offer approximatively the same (good) level of security related to first order DPA attacks. The execution timings of AES implementations based on our proposal or on Oswald et al.’s method are very close and their RAM requirements are almost equal. The additional time required by the Prouff et al. method is slightly greater, however the code seems to be shorter (2844 bytes of ROM versus 2881 and 3144 bytes of ROM for our method and the Oswald et al.’s method). The methods proposed by Trichina [22] and Oswald-Shramm [15] have good timing performances but are not perfectly resistant to first order DPA attacks. – In the method of Trichina et al., a primitive element of F28 is computed and every non-zero element of F28 is expressed as a power of that element. To resolve Problem 1 for the bitwise operation, Trichina et al. use pre-computed discrete logarithm and exponentiation tables to realize the SBox operation. As argued in [15], the method has a faulty behavior when some intermediate values are null and to correct the method without introducing a flaw with respect to first order attacks seems to be an issue. – The method proposed by Oswald and Shramm offers the best timing performances. As for Algorithm 3., it is based on the composite field approach but steps 4 to 7 are replaced by a sequence of table look-ups and bitwise additions. The table look-ups have been render resistant to first order DPA attacks by applying the global look-up table method (which is recalled in Sect. 3). For example, the computation of d−1 ⊕ md−1 (Step 5) is performed by accessing the table Tinv associated to the function ((d ⊕ md ), md ) ∈ (F24 )2 0002→ (d−1 ⊕ md ) ∈ F24 . As argued in Sect. 3, the global look-up table method has a flaw with respect to first order DPA attacks. Indeed, to address the Tinv table the value (d ⊕ md ) md is manipulated, which results in a power consumption that leaks information on the sensitive value d. For instance, it can be checked that I(d, H((d ⊕ md ) md )) is not null, which results in an information leakage in the Hamming Weight model. Moreover, the input and output masks being equal, this method has also a potential flaw in the Hamming Distance model. Indeed, if a transition occurs between the index (d ⊕ md ) md and the value d−1 ⊕ md accessed in the look-up table (which is very likely in a single bus architecture), the mask md is canceled and information leaks about d and/or d−1 .
On the Security of a Popular Web Submission and Review Software (WSaR) for Cryptology Conferences Swee-Won Lo1,0002 , Raphael C.-W. Phan2 , and Bok-Min Goi3 1
School of Electrical & Electronics Engineering & Computer Science, Kyungpook National University, Sankyuk-dong, Buk-gu, Daegu 702-701, Korea [email protected] 2 Laboratoire de s´ecurit´e et de cryptographie (LASEC), Ecole Polytechnique F´ed´erale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland [email protected] 3 Centre for Cryptography and Information Security (CCIS) Faculty of Engineering, Multimedia University, 63100 Cyberjaya, Malaysia [email protected]
Abstract. Most, if not all, conferences use an online system to handle paper submissions and reviews. Introduction of these systems has significantly facilitated the administration, submission and review process compared to traditional paper-based ones. However, it is crucial that these systems have strong resistance against Web attacks as they involve confidential data and privacy. Some submissions could be leading edge breakthroughs that authors do not wish to leak out and be subtly plagiarized. Also, security of the employed system will attract more submissions to conferences that use it and gives confidence of the quality that the conferences uphold. In this paper, we analyze the security of the Web-Submission-and-Review (WSaR) software - latest version 0.53 beta at the time of writing; developed by Shai Halevi from IBM Research. WSaR is currently in use by top cryptology and security-related conferences including Eurocrypt 2007 & 2008, Crypto 2007, and Asiacrypt 2007, annually sponsored by the International Association for Cryptologic Research (IACR). We present detailed analysis on WSaR’s security features. In particular, we first discuss the desirable security features that are designed into WSaR and what attacks these features defend against. Then, we discuss how some untreated security issues may lead to problems, and we show how to enhance WSaR security features to take these issues into consideration. Our results are the first known careful analysis of WSaR, or any type of online submission system for that matter. Keywords: Web submission and review software, security analysis, privacy, passwords, email, protocol. 0002
Part of work done while the author was at [email protected] University (Cyberjaya campus) and iSECURES [email protected] University of Tech (Sarawak campus).
S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 245–265, 2007. c Springer-Verlag Berlin Heidelberg 2007 0002
246
1
S.-W. Lo, R.C.-W. Phan, and B.-M. Goi
Introduction
About two decades ago, authors interested to submit papers to a conference would submit via postal service or airmail. After being reviewed, papers (together with the reviews) would be sent back by similar means. Before the camera-ready deadline, authors of accepted papers would need to race against time to send their camera-ready versions over, and hope that they do not get lost in transit. This method of correspondence is not only costly, it is also time-consuming. More notably, it will be hard to trace lost or delayed papers and reviews since it all depends on the reliability of the postal system. Fast forward a few years, technology advances introduce the use of email (attachments) and facsimile. Although most email services are free of charge, this method is often limited in terms of the size of the attachment(s). Facsimile, on the other hand, can be costly although fast. Thus, the introduction of online web-based systems significantly facilitates the paper submission and review process [26], and overcomes shortfalls in the traditional paper-based system. Authors and reviewers can track their papers’ progress anytime, anywhere, as long as they have an Internet connection. In addition, the conference Chair is now capable of managing papers and reviews more effectively, as well as reacting quickly to feedback and complaints. According to a survey [26] by ALPSP1 on web submission and review systems for journals, among the 442 respondents selected at random from the ISI Web of Knowledge database, 81 per cent preferred to use web submission and review systems and 36 per cent said that they would think twice when choosing a journal without online submission for their work. Following the introduction of online submission, there was a 25 per cent increase in submission volumes and publishers reported a 30 per cent decrease in administration time. From this survey, we see that online submission and review systems are playing a significant role. Nevertheless, popular though they be, there is still an issue involved that should be a major concern among the community - how secure are the data handled by these systems? In this setting, security and privacy can be seen from two opposite perspectives. One, arguably less eminent, is the risk of malicious individuals attempting to obtain unauhorized access to leading edge research results and thus idea theft, or cause unfair dismissal of submitted papers. On the other is the case of honest paper authors desiring that the submission system maintains their privacy and secrecy of research ideas, and be able to verify to themselves and prove later to others that their submissions are properly handled by the system; at least that any errors should be detectable without unnecessary delay. Also desirable to the honest reviewer is that reviewer anonymity is upheld. In this paper, we analyze the web-submission-and-review system (known as WSaR from here onwards) [10] developed by Shai Halevi from IBM Research. WSaR is currently in use by top cryptology conferences including Eurocrypt 2007 & 2008, Crypto 2007 and Asiacrypt 2007, annually sponsored by the International Association for Cryptologic Research (IACR) [11]. See Appendix B for a longer list. 1
Association of Learned and Professional Society Publishers.
On the Security of a Popular Web Submission
2
247
WSaR and Its Security Features
WSaR is open source and is hosted at SourceForge [23]. While analyzing its HyperText Preprocessor (PHP) scripts, we found some security features that have been designed into WSaR to protect against several common Web attacks. This section will analyze how the features are added and the type of attacks the features defend against. 2.1
Password Strength
Passwords are often the first defense against intrusion. Relative to online submission and review systems, passwords are the first fortification to ensure the quality of conference proceedings because they are used to safeguard the submissions and their reviews. In order to gain initial access to the administration or review sites, WSaR computes and generates unique passwords for the conference Chair and reviewers respectively. Firstly, after the customization phase, the Chair will be given a 10-alphanumeric-character password to log in to the administration site. Once logged in, he has the option to change the default password. The same goes to the reviewers (in most security conferences, a reviewer with access to the online system is called a program committee member) - soon after the Chair grants the reviewers access to the review site, they will receive a notification email that gives them the password to log on to their own review sites. Here, they will be able to view the list of submissions (for which they have no conflict of interest), change their reviewing preferences, post their reviews, participate in paper discussions or take part in a ballot. On the other hand, throughout the submission phase, a distinct submission-ID and password will be generated for every paper. Authors will need these parameters to revise or withdraw the papers. However, they do not have the option to change the submission password. These “WSaR-generated” passwords are 10-alphanumeric-character strings. Each character is either uppercase (A-Z), lowercase (a-z), numerals (0-9), or sometimes a tilde (˜) or an underscore ( ). Figure 1 illustrates how passwords are generated in WSaR: 32 hexadecimal-character string
A random string 10753459c7300b71b5838899114
MD5 hash function
5870660775b9e7e9551c9f314c5bb651
Extracted 16 hexadecimal-character string 5870660775b9e7e9
10-alphanumeric-digit string Encoding scheme
M71c1tMvv_
Fig. 1. Password generation process for the reviewer
248
S.-W. Lo, R.C.-W. Phan, and B.-M. Goi
Fig. 2. An example email where generated password is emailed to the conference Chair
1. For the Chair and PC members: A long random string is generated using PHP functions such as uniqid(), mt_rand(), and rand(). This random string has length between 15 to 28 digits. For the submissions: A long random string is generated using the similar functions as above. This string is then appended by the submission’s title and the author’s name. 2. The hash of the resulting string is then computed via PHP’s MD5 function. 3. The first 16 hexadecimal digits of the resulted message digest are extracted. 4. A custom WSaR encoding function is used to compress these extracted digits into a 10-digit alphanumeric string. 5. The resulting string (which is the password) will be emailed to the user (see Figure 2). Note that an alphanumeric character corresponds to 6-bit entropy, so the entire 10-alphanumeric-character string requires an exhaustive effort of 260 to brute force. Therefore, the passwords generated by WSaR are deemed to be secure. 2.2
Password Storage for Conference Chair and PC Members
No matter where passwords are stored on the online system server, they should be stored in encrypted or hashed form, similar to multi-user operating systems like Unix, Linux etc. The most popular way to seal passwords in a database is via password hashing. There are currently two most commonly used hash functions, namely the Secure Hash Algorithm-1 (SHA-1) and the Message Digest 5 (MD5). MD5 produces a hash that is 128 bits long (equivalent to 32 hexadecimal characters) while SHA-1 computes a hash that is 160 bits long (equivalent to 40 hexadecimal characters). Among these two, MD5 is the more commonly used hash function to safeguard passwords as well as to ensure message or software integrity. Likewise, this function is also employed in WSaR when it comes to storing passwords.
On the Security of a Popular Web Submission
A random string
249
32-hexadecimal-character string MD5 hash function
1014245b61e33a37011350915937
2fa0d14fca37b6dce99aecc44521f298
22-alphanumeric-digit salt string BIY05o7gnK3vvHRaepMt1R
Compression function
salt string email password [email protected]_
32-hexadecimal-character string MD5 hash function
2f857cace861d81979f78854399058eb
Database
Fig. 3. Generation of salt string and password storing process
Here, we look at how WSaR protects the Chair and PC members’ passwords stored in the system database (see Figure 3): 1. Upon customization, a random string is generated using PHP functions such as uniqid(), mt_rand(), and rand(). The string’s message digest is computed and the digest is then compressed using a custom WSaR function into a 22-alphanumeric-character salt string. This salt string is saved and is constant throughout the entire conference. 2. The user’s email address and password are retrieved. 3. The salt string is then appended by the user’s email address and password, and it is fed to the MD5 hash function. 4. The resulting digest is stored into the database table (see Figure 4).
Fig. 4. The digests are stored in the database instead of the passwords
In this case, users’ passwords are not stored in the clear in the database, and it is infeasible for an attacker to reverse on the hash values to retrieve the preimages (passwords). In Section 3.5, we will discuss an issue with the storage of submissions’ passwords. Furthermore, this technique of appending a salt string to the email address and password before hashing it increases the difficulty in cracking the passwords using brute-force attack even if an attacker has access to the hash table. In Section 3.4, we discuss how using different salt strings (instead of a constant one) can increase the difficulty in cracking users’ passwords.
250
2.3
S.-W. Lo, R.C.-W. Phan, and B.-M. Goi
Input Sanitization
As discussed in [8], SQL injection is considered one of the most dangerous threats to Web applications because it allows an attacker to connect to the back-end database and extract any data as he wants. An example of an SQL injection attack is the log in form where a user enters his username and password to be authorized, and the server will retrieve the user’s ID and credit card number, for example. Generally, the SQL query is shown in Table 1. Table 1. SQL statement to retrieve user’s ID and credit card number SELECT FROM WHERE AND
ID, CREDIT_NUM users username = ‘$username’ password = ‘$password’
Assuming an attacker enters “Jane” in the username field and provides the string “anything’ OR ‘a’=‘a” in the password field. The SQL query would become: Table 2. SQL statement as “modified” by the attacker SELECT FROM WHERE AND
ID, CREDIT_NUM users username = ‘Jane’ password = ‘anything’ OR ‘a’=‘a’
The ‘a’=‘a’ part is always true regardless of what the first part of the query contains, thus the attacker would be able to trick the application to obtain database data that is not supposed to be returned by the application. Successful SQL injection attack will also result in authentication bypass and database modification [15]. The “one rule” to defend against SQL injection (as well as cross-site scripting, buffer overflows etc.) is input sanitization. If this is done, the Web application will be 80 per cent more secure. There are two commonly used ways to validate input: (1) strip off any undesirable characters (such as meta-characters) and (2) check input data for expected data type [12]. Both ways are implemented in WSaR. We note that the potential SQL injection characters include: (”*;&<>/’ˆ) [7]. In WSaR, a possible exploitation of special characters for an SQL Injection attack is in the “Submission/Revision Receipt” page where, as an example, the receipt page’s URL for submission A with password ‘ABC’ will be“http://localhost/receipt.php? subId=A&subPwd=ABC”. In this case, query to the database will be as in Table 3 (as an example).
On the Security of a Popular Web Submission
251
Table 3. SQL statement to retrieve submission with subId=A and subPwd=ABC SELECT FROM WHERE AND
title, authors, abstract submissions subId=‘A’ subPwd=‘ABC’
Firstly, WSaR uses the my_addslashes() function in PHP to remove undesirable characters. If an attacker happens to be one of the submitters to the conference, she would know the pattern of the receipt page’s URL. Now, she is interested in finding out the paper submitted by her rival, so she launches an SQL Injection attack on the receipt page by changing the URL to “http://localhost/ receipt.php?subId=1&subPwd=1’% 20OR% 20‘1’=‘1, where %20 represents a space in its HTML entity. This time, the query will be as follows: Table 4. SQL query for an SQL Injection attack SELECT FROM WHERE AND
title, authors, abstract submissions subId=‘1’ subPwd=‘1’ OR ‘1’=‘1’
However, since all special characters are removed by the my_addslashes() function - single quotes are ignored using backslashes (commented out), the ‘1’=‘1’ part no longer makes sense. Thus, the attacker will receive a generic error message as shown in the screen shot in Figure 5. If the software does not sanitize user’s input in all PHP scripts (i.e. all my_addslashes() functions are removed from every WSaR scripts), the URL
Fig. 5. Attacker receives an error message
252
S.-W. Lo, R.C.-W. Phan, and B.-M. Goi
Fig. 6. An attacker can click on the Revision or Withdrawal link to revise or withdraw the submission
constructed by the attacker will return submission 1’s receipt page and from there, she can redirect to the revision page and revise her rival’s paper or even withdraw it from the conference without her rival knowing it (see Figure 6). Secondly, developers should specify the type of input that is expected for certain form fields, and that the unexpected input will be removed. As an example, a form field that requests for user’s phone number should accept only numbers as input. WSaR always has a specific input type that is expected for the submission ID - it processes only integers for the submission ID field. In spite of that, developers can make use of regular expressions to tell the program the type of pattern in the text that it should look for [19]. If the data submitted does not match the regular expression, it will be ignored [7] or error messages will be generated. Both the above mentioned methods are employed in WSaR. Therefore, this system is not categorized as one of the 60 per cent of Web applications that is vulnerable to SQL injection (or other attacks due to invalidated input) [24]. 2.4
Resistance to Bypass of Access Control Checks Through Forced Browsing
Forced browsing refers to a technique used by attackers to access to resources that are not referenced, but are nevertheless accessible [25]. Access control checks are normally performed after a user gets authenticated and it monitors what authorized users are allowed to do. As an example, if a reviewer is blocked from reviewing certain submissions due to conflict of interest, he should not be able to bypass the access control checks by requesting the review form of that submission directly in the URL.
On the Security of a Popular Web Submission
253
Fig. 7. Review form for submission 1
Taking a look at the URL of a review form; to review submission A, reviewer will access the review form at the URL “http://localhost/review/ review.php?subId=A” (see Figure 7). Assume that the insider attacker is one of the reviewers in a conference and she is blocked from reviewing submission 1 since she is one of the authors of that submission. In order to make sure that her submission is accepted to the conference, the attacker needs good reviews for her paper. Thus, she tries to change submission A’s review form URL to which she has access, from “http://localhost/review/ review.php?subId=A” to “http://localhost/ review/review.php?subId=1” since she is prohibited to access the link directly from her review page. If the system then displays the review form for submission 1, the attacker has bypassed the access control checks through forced browsing. Fortunately, the attempt to bypass WSaR’s access control check is forbidden in the review form page, as well as the voting page and the reviewers’ discussion forum. Whenever a reviewer attempts to access a blocked submission by directly specifying the submission-ID in the URL, WSaR will firstly perform an authorization check in the reviewer table; if the reviewer is blocked from that submission, it will display an error message indicating that the submission is not found, or the reviewer has a conflict as shown in Figure 8.
Fig. 8. WSaR prevents broken access control
3
Security Issues and Enhancements
In addition to the security features already designed into WSaR discussed in Section 2, we have also discovered some security issues not treated in WSaR and in this section, we discuss them in detail and then describe ways to enhance WSaR security by taking them into consideration.
254
S.-W. Lo, R.C.-W. Phan, and B.-M. Goi
For any security issue, we will discuss its implications from the two opposing perspectives motivated in Section 1. We also highlight whether exploitation of issues can be traceable or uniquely pointing the finger to the culprit. This has devastating consequences if the attacker cannot be traced since it means even a curious (if not malicious) researcher from the scientific community could have mounted the attack without any counter-incentives i.e. no adverse effects on his reputation. Furthermore, issues that lead to attacks for which the culprit cannot be unambiguously accused will cause disputes for which a malicious attacker could deny his involvement or an honest user be unfairly thought by peers to have mounted an attack. 3.1
Browser Caching
Browser caching is categorized as one of the critical areas in OWASP’s Top Ten projects under “Broken Authentication and Session Management” [25]. In WSaR, the submission-ID and password are sent by the HTTP GET method. Information sent in such way will be displayed in the browser’s URL [18] and it is extremely undesirable (see Figure 9).
Fig. 9. Submission-ID and password are displayed in the page’s URL
This means the submission-ID and password are part of the URL. As has been highlighted in [25], authentication and session data should not be submitted as part of a GET to prevent someone malicious from using the “Back” button in an authorized user’s browser to backup the page and hence obtain the password from the URL. Alternatively, even if a browser window is closed, the URL info can also be obtained from the browser’s cache and history. Recall that browsers cache most of the contents of frequently visited pages so that the pages will load faster the next time they are visited, including images, sounds, cookies, web pages and their URLs. Information stored in browsers’ cache is not encrypted [1] and it can be obtained by anyone who accesses the computer. Both Netscape Navigator and Mozilla Firefox, by default require users to clear the browser’s cache (and disk cache - also known as “Temporary Internet Files”) manually. On the other hand, Internet Explorer will clear the browser’s cache but not the “Temporary Internet Files”, once the browser is restarted. In addition, previously visited URLs are typically stored in the browser’s history; for instance the latest versions (at the time of writing) of Netscape Navigator and Mozilla Firefox by default will only clear browser’s history every nine days, while the default for Internet Explorer is 20 days.
On the Security of a Popular Web Submission
255
Fig. 10. Internet Explorer’s temporary internet files that stores the receipt page’s URL
Figure 10 shows the receipt URLs stored in the “Temporary Internet Files” folder. In spite of that, if the user does not clear the browser’s cache or history after he uses the computer, the URLs will also be displayed in the browser’s history pane as shown in Figure 11.
Fig. 11. Submission ID and password are exposed to the attacker
This threat can be launched by any individual having access to a common machine previously utilized by a WSaR user, and upon retrieving the login information (ID and password), he can login as the WSaR victimized user. This attack is non-traceable in the sense that the attacker cannot be later pinpointed, so he can mount the attack without any risk of jeopardizing his reputation. Thus, the counter-incentive is non-existent that it will indeed be tempting for any individual to abuse this issue to the victim’s disadvantage. Browser caching can be prevented by submitting the submission-ID and password as part of a HTTP POST method [25] instead. If the POST method is used, we can employ the PHP’s predefined variable “$_POST” to retrieve the values entered in the submission-ID and password fields respectively. In this case, both submission-ID and password will not be displayed in the page’s URL. Other than this, WSaR users are advised to clear the browser’s cache before they leave the computer. In Netscape Navigator, users would have to clear both the memory and disk cache; in Mozilla Firefox, users should check all fields when clearing their private data [4]; in Internet Explorer, users should clear the history and all temporary internet files [21]. This way, users can ensure the security of their private information. 3.2
Constant Salt String for Reviewer and Chair Passwords
In Section 2.2, we discussed how WSaR uses a salt string appended to the reviewer’s email address and password to build a stronger defense in securing his
256
S.-W. Lo, R.C.-W. Phan, and B.-M. Goi
passwords. However, this salt string is constant throughout the entire conference for any party, thus all users of the same conference will have the same salt appended to their password. So what differentiates each user is just the email address (which is typically public information) and his password. This indicates that the salt string does not really provide better strength against insider attackers (users of the same conference) than conventional password-based authentication systems. Therefore, we recommend the use of different salt strings for different users. These salt strings will be stored in a single PHP script upon customization of WSaR, and they will be labeled in the sense that “SALT_2” will be used for PC member with the ID of 2 and so on. In this case, since the salt strings will be random and of different lengths, the attacker would have a much harder time trying to guess the exact salt that is appended to a specific user’s email address and password. Again, this threat is non-traceable as it allows a malicious insider individual to brute-force the password, upon which he can login as the victim without any evidence pointing back to him. 3.3
Storage of Submission Passwords
Throughout our analysis, we discovered that although WSaR stores the hashes of reviewers’ passwords, it does not do the same when it comes to storing submissions’ passwords (see Figure 12). For someone who manages to gain access to the database, he has the ID and password for every submission as well. We recommend that the submission passwords to be salted and hashed as well to secure them so that even if a hacker manages to penetrate the database’s security, he would face the prospect of a potentially expensive search for the exact password.
Fig. 12. Submissions’ passwords are stored in the clear in the database
3.4
Password Policy and Strength Checking
Using a good password by every user is vital to defending a system. The components of a good password should be at least eight characters long, should not consist of dictionary words (would be vulnerable to dictionary attack otherwise), should never be the same as the user’s log in name, should not consist of any item that is easily identified with the user, and have at least three of the following elements [5]: – One or more uppercase letters (A-Z) – One or more lowercase letters (a-z)
On the Security of a Popular Web Submission
257
– One or more numerals (0-9) – One or more special characters or punctuation marks However, it is worth noting that some systems do not permit the use of certain special characters in user’s password string. Although the passwords supplied by WSaR fulfill all the mentioned requirements, users might find the password hard to remember and opt to change it. Therefore, we analysed WSaR’s password change mechanism. WSaR does not have any password policies imposed to ensure the strength of new passwords, thus a careless user may happen to employ a password which is easily cracked by any password-cracking tools, without being reminded not to do so. Hence, we suggest that WSaR should force its users to adopt passwords that cannot be easily cracked. Google mail [9] uses a scale to show its users the strength of their passwords with parameters such as “Too short”, “Weak”, “Fair”, “Good” or “Strong”. This is a feature that could be added to WSaR.
Fig. 13. Google mail’s password strength evaluation
However, careless users tend to ignore the password strength evaluation and proceed to employing the new password. Consequently, the system’s security could still be breached. Having said that, WSaR’s password changing mechanism should perform an assessment on the new password - whether it is too short or it contains dictionary words or even a row of letters from a standard keyboard layout (e.g., ertyui) [14]. If a user’s new password happens to be a weak password, a pop up window should be issued to require him to re-enter a new password. Besides that, a password policy should also be implemented to require frequent change of passwords. If an attacker uses brute-force attack to crack a password, it is possible that he would be taking a long time (depending on the length of the password, speed of both the machine and network connection) to complete the attack [14]. In this case, if the user changes his password frequently, he could avoid his password from being cracked via brute-force attack. 3.5
Absence of File Integrity and Binding
Whenever we download any files or software, the first thing we would want to do is to check the file’s integrity. Presently, the most popular method used to verify a file’s integrity is via the use of the MD5 function. If the file content is modified, it can be easily detected by re-computing its hash value. In an online submission and review system, calculating a submission’s message digest is important both to ensure message integrity and to bind the author to the submitted paper due to absence of face-to-face communication. We note that WSaR does not have this feature, i.e. upon submission the file’s message digest is not computed and thus not supplied to both the author and the
258
S.-W. Lo, R.C.-W. Phan, and B.-M. Goi
Chair. Therefore, considering the case of malicious users, if an author happens to submit a not-so-perfect paper in order to meet the submission deadline, he can later deny that he submitted that version of the paper, then claim that the file is corrupted and request for a second-time submission, thus gaining advantage over other authors. More devastatingly, an honest author could have submitted a proper paper but the file became corrupted by the server machine; in which case it is desirable to be able to unambiguously prove that the corrupted version is not the submitted file, or at least be able to check during submission that it was properly received. This avoids ambiguity when a Chair notes during review phase that a file is corrupted and is unsure if it was intentionally submitted in that form in order to buy authors some time, or if it was indeed submitted properly but was corrupted in transit. Indeed, this issue is often not so much to guard against malicious authors but rather so as to allow a scientifically honest author in this situation to be able to prove his innocence beyond any doubt. To counter this issue, we recommend that authors be presented with a 128bit hash of his submission file and that this hash value is kept as a record for the Chair. Since WSaR is developed in PHP, we can apply the “md5_file()” function where it calculates the message digest of the given file [18]. Firstly, a new column has to be created in the “submissions” table to record the message digest by adding a “msgDigest varchar(255) BINARY NOT NULL” statement in the “create_table( )” function, which can be found in WSaR’s database.php script. Subsequently, we added a statement to calculate the file’s message digest in the act-submit.php and act-revise.php scripts, which are used in WSaR to process all parameters entered in the submission and revision form respectively. The resulted digest is inserted into the database under the “msgDigest” column and authors will be redirected to the receipt page. An email will also be generated to the author, and carbon-copied to the Chair with all submission details listed (including the message digest). Now, we can be assured of the file’s integrity as well as binding the author to his submission. The screenshots are shown as Figure 14 and Figure 15 in Appendix A for further illustration.
4
Protocol Sketch for Password Distribution Via Email
In general (not specific to WSaR), passwords for reviewers are commonly sent via emails to the reviewers’ email addresses. This is indeed the most practical way to distribute passwords, although it is known that email formats by default (and this is the setting that users commonly use) do not provide any form of confidentiality nor authenticity, unless explicit email clients or plug-ins like PGP are applied. We take what we view as a concrete step towards securing online systems by motivating here and sketching the basic idea of our ongoing work: a proposal for an email-based password distribution protocol for security conferences. Having this kind of protocol in place will prevent reviewer passwords from being easily
On the Security of a Popular Web Submission
259
compromised through attacks mounted not on the system itself but on how this password is distributed. Indeed, it is well known that the study of passwordbased key exchange protocols is a long-standing research topic in cryptology, thus we should avoid having security conferences using in practice the emailbased password distribution protocols that are not securely designed. The setting for an email ID-based password distribution protocol is different from those in typical key exchange protocols [3]. The protocol involves two parties, the program chair C and the reviewer R. Rather than requiring any public-key infrastructure (PKI), only a trusted web site bulletin board is used, for instance the IACR website (http://www.iacr.org), where URLs for different IACR conferences or workshops are hosted or mirrored. On this website is listed the email address IDC of the program chair. Meanwhile, it is common that the chair invites program committee members (the reviewers) who are experts in the field and for whom the chair knows the authentic email addresses IDR either himself, or for which he can ask from other experts through some out of band mechanism. Alternatively, the email addresses can be obtained from the IACR membership database. Thus, when a reviewer R receives an email from the chair C, it can check to be certain that the email came from the chair; vice versa a chair knows if an email came from a particular reviewer. This provides source authentication without resorting to any PKI. Then a general sketch of this kind of protocol proceeds as follows: 1. C generates an ID-based public and private key pair based on his IDC . Denote these as pkC and skC . Then C computes the message sigSKC (IDC , IDR ,Invite) where sigSK denotes signing under a person’s private key SK, and Invite contains a one-way (e.g. hashed) representation of a typical invitation email stating the conference name etc. 2. C sends m1 = sigSKC (IDC , IDR , Invite) to R. 3. R generates an ID-based public and private key pair based on his IDR . Denote these as pkR and skR . Then R computes the message sigSKR (IDR , IDC , Accept). 4. R sends m2 = sigSKR (IDR , IDC , Accept) to C. 5. C generates a password pwdR for R. 6. C sends m3 = 0002sigSKC (IDC , IDR , m1 , m2 , EncP KR (pwdR )), EncP KR (pwdR )0003 to R, where EncP K (·) denotes encryption under a person’s public key P K. 7. R obtains pwdR by decrypting EncP KR (pwdR ). In fact, this can be a concrete step to a long-term setting where one submission and review system is used for all conferences that use WSaR, so only a one-time setup cost e.g. the ID-based password distribution protocol as above, is incurred for new users, while existing users can update their passwords online through the system at any time; and new membership in conference program committees of existing users only require the program chair to email to a PC member R asking him to update his password himself rather than having to email newly generated passwords every time R is involved in a new conference program commitee. Indeed, this centralization is possible since conferences using WSaR are typically
260
S.-W. Lo, R.C.-W. Phan, and B.-M. Goi
hosted on a central machine e.g. at http://s1.iacr.org, unlike some other submission software that need to be locally set up. Furthermore, conferencesusing WSaR commonly involve program committee members who are involved in multiple conferences so the setup cost will be amortized over time and this centralization makes sense compared to treating each conference system separately. We emphasize here that the above is a sketch of a protocol design that we are currently working on, whose formal security we are in the process of proving. This should therefore preclude the sprouting of future papers that propose informal “breaks” on the above enumerated steps. The above sketch should only be taken as a basic template and taken for the general idea that it is sketching and nothing more. We do welcome comments for which will be acknowledged in future work, or collaborations in this direction.
5
Concluding Remarks
In this paper, we have seen that WSaR is built with a strong defense against a number of known attacks in the Web. We have also discovered and discussed the absence of several features that could jeopardise the security of WSaR users and we proposed suggestions to further enhance WSaR’s security that address these issues. As a side remark, we recommend that besides strengthening the defense of the software itself e.g. WSaR, Web administrators should secure the Web server as well as the back-end database and constantly monitor the activities going on in the Web server to detect any malicious behaviour. In addition, as a further step to securing these types of systems, we have also motivated the design of an email-based password distribution protocol for WSaR users and argued that together with such a design, there are advantages in centralizing conferences that use WSaR.
Acknowledgement The first author thanks Shai Halevi for encouragement during the initial stages of this research, for patiently answering queries about WSaR and for detailed comments on an early version of this paper. The second author thanks Thomas Baign`eres for stimulating discussions on another popular submission and review system iChair. Nous avons eu un temps merveilleux pendant la pause de caf`e, ou bien? We thank an anonymous referee for motivational comments especially on the need to make clear the distinction between pro- and counter-incentives for an attacker; and for acknowledging the fun of this research work :-)
References 1. AICT Security - Empty your Cache. Available online at https://www.ualberta.ca/AICT/Security/BrowserCache.html#private 2. Archer, T.: Are Hash Codes Unique? Available online at http://blogs.msdn.com/tomarcher/archive/2006/05/10/594204.aspx
On the Security of a Popular Web Submission
261
3. Bellare, M., Rogaway, P.: Entity Authentication and Key Distribution. In: Stinson, D.R. (ed.) CRYPTO 1993. LNCS, vol. 773, pp. 232–249. Springer, Heidelberg (1994) 4. CIBC - Clear Your Browser’s Cache. Available online at http://www.cibc.com/ca/legal/clear-browsers-cache.html 5. Conklin, W.A., White, G.B., Cothren, C., Williams, D., Davis, R.L.: Principles of Computer Security: Security+ TM and Beyond. McGraw-Hill, New York (2005) 6. EasyChair Conference System. Available online at http://www.easychair.org/ 7. Foster, J.C.: Defense Tactics for SQL Injection Attacks. Available online at http://searchappsecurity.techtarget.com/tip/0,289483,sid92 gci1219912, 00.html 8. Fyre, C.: One Simple Rule to Make your Web Apps more Secure (2006), Available online at http://searchappsecurity.techtarget.com/qna/0,289202, sid92 gci1225425,gci1225425,00.html00.html 9. Google Mail. Available online at http://gmail.google.com 10. Halevi, S.: Web Submission and Review Software. Available online at http://theory.csail.mit.edu/∼ shaih/websubrev 11. IACR Conferences. Available online at http://www.iacr.org/conferences/ 12. McClure, S., Shah, S., Shah, S.: Web Hacking: Attacks and Defense. AddisonWesley, Reading (2003) 13. Microsoft Corporation. Microsoft’s Conference Management Toolkit. Available online at http://msrcmt.research.microsoft.com/cmt/ 14. Password Cracking: Information from Answers.com (2006), Available online at http://www.answers.com/topic/password-cracking 15. Peikari, C., Chuvakin, A.: Security Warrior. O’Reilly (2004) 16. Phan, R.C.-W., Goi, B.-M.: Flaw in IEEE Trans on Consumer Electronics Online Submission System. In: Dawson, E., Vaudenay, S. (eds.) Mycrypt 2005. LNCS, vol. 3715, Springer, Heidelberg (2005) 17. Phan, R.C.-W., Ling, H.-C.: On the Insecurity of the Microsoft Research Conference Management Tool (MSRCMT) System. In: CITA 2005. Proceedings of International Conference on IT in Asia, pp. 75–79 (2005) Also presented at the rump session of Asiacrypt 2004, Jeju Island, Korea 18. PHP Manual. Full version available online at http://www.php.net/manual/en/ 19. Regular Expressions (2006), Available online at http://searchappsecurity. techtarget.com/sDefinition/0,290660,sid92 gci517740,00.html 20. ScholarOne, Inc. Manuscript Central: About Manuscript Central. Available online at http://www.scholarone.com/products manuscriptcentral aboutMC.shtml 21. Security Information Clearing Browser Cache and History. Available online at http://www.hlasset.com/files/Clearing Cache History.pdf 22. SoftConf.com - Software for Conferences. Available online at http://www.softconf.com/index.html 23. SourceForge.net: Web Submission and Review Software. Available online at http://sourceforge.net/projects/websubrev 24. What is SQL Injection? (2006), Available online at http://searchappsecurity. techtarget.com/sDefinition/0,290660,sid92 gci1003024,00.html 25. The Ten Most Critical Web Application Security Vulnerabilities (2004) Available online at http://osdn.dl.sourceforge.net/sourceforge/owasp/OWASPTop Ten2004.pdf 26. Ware, M.: Online Submission and Peer-Review System (2005) Available online at www.zen34802.zen.co.uk/Learned Publishing offprint.pdf
262
A
S.-W. Lo, R.C.-W. Phan, and B.-M. Goi
Storage and Display of Submissions’ Digests
We present below the screen shots of the “submissions” table, and the submission/revision receipt after the calculation of submission’s digest is incorporated.
Fig. 14. Digest of submission is stored in the database for Chair’s reference
Fig. 15. Digest of submission is presented in the receipt for the author
B
Conferences That Have Used or Are Using WSaR
In reverse chronological order: 1. EUROCRYPT 2008: 27th Annual International Conference on the Theory and Applications of Cryptographic Techniques 2. CT-RSA 2008: RSA Conference 2007, Cryptographers’ Track 3. LATIN 2008: 8th Latin American Theoretical Informatics 4. TCC 2008: 5th Theory of Cryptography Conference 5. PKC 2008: 11th International Conference on Theory and Practice in PublicKey Cryptography 6. ASIACRYPT 2007: 13th Annual International Conference on the Theory and Application of Cryptology and Information Security 7. ISC 2007: 10th Information Security Conference 8. CRYPTO 2007: 27th Annual International Cryptology Conference 9. ICALP 2007: Track C of the 34th International Colloquium on Automata, Languages and Programming
On the Security of a Popular Web Submission
263
10. GOCP 2007: 1st International Workshop on Group-Oriented Cryptographic Protocols 11. ACNS 2007: 5th International Conference on Applied Cryptography and Network Security 12. EUROCRYPT 2007: 26th Annual International Conference on the Theory and Applications of Cryptographic Techniques 13. USEC 2007: Usable Security Workshop 14. TCC 2007: 4th Theory of Cryptography Conference 15. CT-RSA 2007: RSA Conference 2007, Cryptographers’ Track 16. HVC 2006: 2nd Annual Haifa Verification Conference 17. PKC 2006: 9th International Conference on Theory and Practice of PublicKey Cryptography
C
Related Work
Many online paper submission and review systems are emerging. For the context of cryptology and information security, the predecessor to WSaR is the collection of PHP/Perl scripts written progressively by Chanathip Namprempre, Andre Adelsbach, Andrew Clark and the Computer Security and Industrial Cryptography (COSIC) group at Katholieke Universiteit Leuven. These scripts were used for almost all mainstream cryptology and information security conferences till 2006 when building on ideas in these scripts, two successor systems were developed independently: WSaR by Shai Halevi of IBM Research and iChair by Thomas Baign`eres and Matthieu Finiasz of LASEC at EPFL. These two systems are now used by almost all mainstream cryptology and information security conferences. A few other major submission and review systems used in other fields deserve some mention here: 1. Manuscript Central Manuscript Central [20], developed by ScholarOne, Inc., is the online submission and peer review system used to handle manuscript submissions to journals. It manages over 44,000 submissions per month and its comprehensive and user-friendly features result in reports that most journals using Manuscript Central achieve gains in submissions of 20 to 40 per cent per annum. This system is currently used by most IEEE and Association of Computing Machinery (ACM) journals. Manuscript Central is a fullydeveloped software with 24-hour support on weekdays. By contacting the sales representative, one would be able to obtain and understand the system’s functionalities, features, pricing and get the system running in two weeks’ time. 2. Microsoft Research Conference Management Tool (MSRCMT) Firstly developed for ACM SIGKDD 1999, the MSRCMT [13] is an academic conference management service sponsored by Microsoft Research. Surajit Chaudhuri, a Research Area Manager at Microsoft Research is the architect of MSRCMT. Since the year 1999, this system has been used in over 500
264
S.-W. Lo, R.C.-W. Phan, and B.-M. Goi
conferences, among them are the International Conference on Security of Information and Networks (SIN 2007) and the ACM SIGCOMM 2007 Data Communication Festival. Similar to Manuscript Central, the MSRCMT is also a fully-developed system. It is free and hosted by Microsoft Research, but with limited support since it is developed and managed by a small team. 3. EasyChair Developed in the year 2002 by Andrei Voronkov, a Professor from the University of Manchester, EasyChair [6] is used by over 600 conferences in year 2007 alone. EasyChair is free and it is currently hosted by University of Manchester’s Computer Science Department. EasyChair is capable of supporting two models: (1) the standard model for conferences having one program committee and (2) the multi-track version for conferences having multiple tracks that have their own program committee. There are a number of ACM and IEEE conferences/workshops that have used or are using EasyChair. Among them are the 8th IEEE/ACM International Conference on Grid Computing (Grid 2007), the 2nd ACM Workshop on Scalable Trusted Computing (STC’07) and the 20th IEEE Computer Security Foundation Symposium (CSF 20). 4. START V2 ConferenceManager START V2 [22], written by Rich Gerber, is a product from SoftConf.com. Apart from EasyChair, several IEEE and ACM conferences have, or are employing START V2 as the submission and peer review system since the year 2002. The IEEE Symposium on Intelligence and Security Informatics (ISI 2007), the 2007 IEEE Symposium on Security and Privacy, the 5th ACM Workshop on Recurring Malcode (WORM 2007) and the 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007) are among the many IEEE and ACM conferences using START V2. Users would need to contact the developer to order the system; the pricing depends on whether one needs the system to be hosted at softconf.com, and on the different types of licensing arrangements. This system is constantly improving based on the users’ feedback. It is worth to note two earlier work related to the security of online submission and review systems, although our work here is the first detailed analysis of this type of system. Phan and Goi [16] pointed out the lack of privacy in a system used by an IEEE Transactions where the URL of pages that disclose paper information and that allow paper revision for a particular submitted paper, differ from pages of other papers by an ID counter. Thus if the URL to revise author A’s submitted paper is uniquely identified by ID 100, then A can also view the revision page for the paper submitted right after (resp. before) his, which he knows will be uniquely identified by ID 101 (resp. 99). Correspondence with the administrator of the system obtained the response that this was not a significant issue. Phan and Ling [17] discovered by accident when submitting their paper to a conference using the Microsoft Research Conference Management Tool (MSRCMT) that the system automatically creates an account for co-authors of a
On the Security of a Popular Web Submission
265
corresponding author who submitted a paper, where email addresses of these co-authors are used as login usernames and the numeric 0 is used as the default password. This applied for any co-author(s) of any paper. Correspondence with developers of MSRCMT obtained the response that this was an exercise of regression testing, though the flaw was present for some months in the actual online system used by several conferences. As of 2005 it was verified that both the above systems no longer exhibit those issues.
Authorization Constraints Specification of RBAC Lilong Han, Qingtan Liu, and Zongkai Yang Department of Information and Technology&Engineering Research Center on Education Information Technology, Central China Normal Uninversity, 430079,Wuhan, China {Lilong Han,Qingtan Liu,Zongkai Yang,hanlilong2001}@yahoo.com.cn
Abstract. Constraints are an important aspect of role-based access control (RBAC) and are often regarded as one of the principle motivations behind RBAC. Although the importance of the constraints in RBAC has been recognized for a long time, they have not received much attention. In this article, we introduce an intuitive formal language for specifying role-based authorization constraints named RCL2000 including its basic elements, syntax and semantics. We show how previously identified role-based authorization constraints such as separation of duty (SOD) can be expressed in this language, and that there are other significant SOD properties that have not been previously identified in the literature. Our work indicates that there are many alternate formulations of even the simplest SOD properties, with varying degree of flexibility and assurance. So this language provides us a rigorous foundation for systematic study of role-based authorization constraints. Keywords: RBAC, Constraints, RCL2000, SOD, DSOD.
1 Introduction Role-based access control (RBAC) has emerged as a widely accepted alternative to classical discretionary and mandatory access controls. RBAC regulates the access of users to information and system resources on the basis of activities that users need to execute in the system, and requires the identification of roles in the system. Since roles in an organization are relatively persistent with respect to user turnover and task reassignment, RBAC provides a powerful mechanism for reducing the complexity, cost, and potential for error in assigning permissions to users within the organization. Because roles within an organization typically have overlapping permissions, RBAC models include features to establish role hierarchies, where a given role can include all of the permissions of another role. Another fundamental aspect of RBAC is authorization constraints (also simply called constraints). Although the importance of constraints in RBAC has been recognized for a long time [1], they have not received much attention in the research literature, while role hierarchies have been practiced and discussed at considerable length. In this article, our focus is on constraint specifications, that is, on how constraints can be expressed, whether in natural languages, such as English, or in more formal languages. Natural language specification has the advantage of ease of comprehension by human beings, but may be prone to ambiguities, and the specifications do not lend themselves to the analysis of properties of the set of constraints. S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 266–276, 2007. © Springer-Verlag Berlin Heidelberg 2007
Authorization Constraints Specification of RBAC
267
To specify these constraints we introduce the specification language RCL2000 (for Role-Based Constraints Language 2000, pronounced Ríckle2000) which is the specification language for role-based authorization constraints [2]. In this article, we describe its basic elements, syntax, and the formal foundation of RCL2000. RCL2000 is a substantial generalization of RSL99 [Ahn and Sandhu 1999], which is the earlier version of RCL2000. It encompasses obligation constraints in addition to the usual separation of duty and prohibition constraints.
2 Role-Based Constraints Language (RCL 2000) RCL2000 is defined in context of the well-known family of models for RBAC of Sandhu et al. This model has become a widely cited authoritative reference and is the basis of a standard currently under development by the National Institute of Standards and Technology. Here we use a slightly augmented form of the model illustrated in Figure 1. We decompose permissions into operations and objects to enable formulation of certain forms of constraints. Also in Figure 1 we drop the administrative roles since they are not germane to RCL2000.
Fig. 1. Basic elements and system functions
Constraints are an important aspect of role-based access control and are a powerful mechanism for laying out higher-level organizational policy. The importance of flexible constraints to support emerging applications has been recently discussed by many scholars. Consequently, the specification of constraints needs to be considered. To date, this topic has not received much formal attention in the context of role-based access control. A notable exception is the work of Giuri and Iglio who defined a formal model for constraints on role-activation. RCL2000 considers all aspects of role-based constraints, not just those applying to role activation. RCL2000 goes beyond separation of duty to include obligation constraints such as those used in the constructions of Sandhu and Osborn et al. for simulating mandatory and discretionary access controls in RBAC.
268
L. Han, Q. Liu, and Z. Yang
One of our central claims is that it is futile to try to enumerate all interesting and practically useful constraints because there are too many possibilities and variations. Instead, we should pursue an intuitively simple yet rigorous language for specifying constraints such as RCL2000. The expressive power of RCL2000 is demonstrated in Section 4, where it is shown that many constraints previously identified in the RBAC literature and many new ones can be conveniently formulated in RCL2000. 2.1 Basic Components The basic elements and system functions on which RCL2000 is based are defined in Figure 2. Figure 1 shows the RBAC model which is the context for these definitions. RCL2000 has six entity sets called users (U), roles (R), objects (OBJ), operations (OP), permissions (P), and sessions (S). These are interpreted as in RBAC model as discussed above. OBJ and OP are not in RBAC model. OBJ is the passive entities that
… …
—U=a set of users,{u1, ,un} —R=a set of roles,{r1, ,rm} —OP=a set of operations,{op1, ,opo} —OBJ=a set of objects,{obj1, ,objr} J a set of permissions,{p1, —P=OP —S=a set of sessions,{s1, ,sr}
…
… …
…,p } q
—RH R Ris a partial order on R called the role hierarchy or role dominance relation,written as ≤ —UA U R, a many-to-many user-to-role assignment relation —PA P R=OP OBJ R, a many-to-many permission-to-role assignment relation —user :S U,a function mapping each session si to the single user. user: R U, a function mapping each role ri to a set of users. r , a function mapping the set U,P and S to a set of roles R. —roles :U P S roles*: U P S r, extends roles in presence of role hierarchy.
roles(ui)={r R (ui, r) U roles*(ui)={r R ( r ≥r)[(ui, r ) U roles(pi)={r R (pi, r) P roles*(pi)={r R ( R ≤r)[(pi, r ) P roles(si)={r R (sessions(si),r) U roles*(si)={r R ( R ≥r)[ r roles(si) —sessions: U 2s, a function mapping each user ui to a set of sessions
’ ’ ’ ’ ’ ’
—permissions :R p, a function mapping each role ri to a set of permissions. p , extends permissions in presence of role hierarchy. —permissions*: R permissions(ri)={p P (p,ri) PA} permissions*(ri)={p P ( r≤ri)[(p,ri) PA]} —Operations: R OBJ 2OP, a function mapping each role ri and object obji to a set of operations Operations(ri,obji) ={op OP (op,obji,ri) PA} OBI , a function mapping each permissions pi to a set of objects —object: P
Fig. 2. Basic elements and system functions
Authorization Constraints Specification of RBAC
269
contain or receive information. OP is an executable image of a program, which upon execution causes information flow between objects. P is an approval of a particular mode of operation to one or more objects in the system. 2.2 Additional Elements Additional elements and system functions used in RCL2000 are defined in Figure 3. The precise meaning of conflicting roles, permissions, and users will be specified as per organizational policy in RCL2000. For mutually disjoint organizational roles such as those of purchasing manager and accounts payable manager, the same individual is generally not permitted to belong to both roles. We defined these mutually disjoint roles as conflicting roles. We assume that there is a collection CR of sets of roles that have been defined as conflicting.
Fig. 3. Additional elements and nondeterministic functions
The concept of conflicting permissions defines conflict in terms of permissions rather than roles. Thus the permission to issue purchase orders and the permission to issue payments are conflicting, irrespective of the roles to which they are assigned. We denote sets of conflicting permissions as CP. As we show, defining conflict in terms of permissions offers greater assurance than defining it in terms of roles. Conflict defined in terms of roles allows conflicting permissions to be assigned to the same role by error (or malice). Conflict defined in terms of permissions eliminates this possibility. In the real world, conflicting users also should be considered. For example, for the process of preparing and approving purchase orders, it might be company policy that members of the same family should not prepare the purchase order, and also be a user who approves that order. RCL2000 has two nondeterministic functions, oneelement and allother. The oneelement(X) function allows us to get one element Xi from set X. We usually write oneelement as OE. Multiple occurrences of OE(X) in a single RCL2000 statement all select the same element Xi from X. With allother(X) we can get a set by taking out one element. We usually write allother as AO. These two nondeterministic functions are related by context, because for any set S, {OE(S)} U AO=S, and at the same time, neither is a deterministic function. In order to illustrate how to use these two functions to specify role-based constraints, we take the requirement of the static separation of duty (SOD) property which is the simplest variation of SOD [3]. For simplicity assume there is no role
270
L. Han, Q. Liu, and Z. Yang
hierarchy (otherwise replace roles by roles*). Requirement: No user can be assigned to two conflicting roles. In other words, conflicting roles cannot have common users. We can express this requirement as below. Expression: roles (OE (U)) I OE(CR) ≤ 1 OE(CR) means a conflicting role set and the function roles(OE(U)) returns all roles that are assigned to a single user OE(U). Therefore this statement ensures that a single user cannot have more than one conflicting role from the specific role set OE(CR). We can interpret the above expression as saying that if a user has been assigned to one conflicting role, that user cannot be assigned to any other conflicting role. We can also specify this property in many different ways using RCL2000, such as OE(OE (CR)) ∈ Roles(OE(U)) ⇒ AO(OE(CR)) I roles(OE(U))= φ or user(OE(OE(CR)))
I User(AO(OE(CR)))= φ .
The expression roles (OE (U)) I OE(CR) ≤ 1 specifies dynamic separation of duties (DSOD) applied to active roles in a single session as opposed to static separation applied to user-role assignment. Dynamic separation applied to all sessions of a user is expressed by roles(sessions(OE (U))) I OE(CR) ≤ 1. A permission-centric formulation of separation of duty is specified as roles(OE(OE(CP))) I roles(AO(OE(CP)))= φ .The expression roles(OE(OE(CP))) means all roles that have a conflicting permission from, say cpi, and roles(AO(OE(CP))) stands for all roles that have other conflicting permissions from the same conflicting permission set cpi. This formulation leaves open the particular roles to which conflicting permissions are assigned but requires that they be distinct. This is just a sampling of the expressive power of RCL2000 discussed in Section 4. 2.3 Syntax of RCL 2000 The syntax of RCL 2000 is defined by the syntax diagram and grammar given in Figure 4. The rules take the form of flow diagrams. The possible paths represent the possible sequence of symbols. Starting at the beginning of a diagram, a path is followed either by transferring to another diagram if a rectangle is reached or by reading a basic symbol contained in a circle. Backus Normal Form (BNF) is also used to describe the grammar of RCL2000 as shown in the bottom of Figure 4. The symbols of this form are “::=” meaning “is defined as” and “ ” meaning “or.” Figure 4 shows that RCL2000 statements consist of an expression possibly followed by implication ( ⇒ ) and another expression. Also RCL2000 statements can be recursively combined with a logical AND operator ( ∧ ). Each expression consists of a token followed by a comparison operator and token, size, set, or set with cardinality. Also a token itself can be an expression. Each token can be just a term or a term with cardinality. Each term consists of functions and sets including set operators. The sets and system functions described earlier in Section 2.1 are allowed in this syntax. Also, we denote oneelement and allother as OE and AO, respectively.
Authorization Constraints Specification of RBAC
271
Fig. 4. Syntax of language
3 Formal Semantics of RCL2000 In this section, the formal semantics for RCL2000 is discussed. We do so by identifying a restricted form of first-order predicate logic called RFOPL which is exactly equivalent to RCL2000. Any property written in RCL2000, called a RCL2000 expression, can be translated to an equivalent expression in RFOPL and vice versa. The translation algorithm, namely, Reduction, converts a RCL2000 expression to an equivalent RFOPL expression. The Reduction algorithm eliminates AO function(s) from a RCL2000 expression in the first step. Then we translate OE terms iteratively introducing universal quantifiers from left to right. If we have nested OE functions in the RCL2000 expression, translation will start from the innermost OE terms. This algorithm translates the RCL2000 expression to an RFOPL expression in time O(n), supposing that the number of OE terms is n.
272
L. Han, Q. Liu, and Z. Yang
For example, the following expression can be converted to an RFOPL expression according to the sequences below. Example 1. OE(OE(CR)) ∈ roles(OE(U)) ⇒ AO(OE(CR)) I roles(OE(U))= φ
(1)OE(OE(CR)) ∈ roles(OE(U) ⇒ (OE(CR)-{OE(OE(CR))} I roles(OE(U))= φ
(2) ∀ cr ∈ CR : OE(cr) ∈ roles(OE(U)) ⇒ (cr-{OE(cr)}) I roles(OE(U))= φ (3) ∀ cr ∈ CR, ∀ r ∈ cr : r ∈ roles(OE(U)) ⇒ (cr-{r}) I roles(OE(U))= φ (4) ∀ cr ∈ CR, ∀ r ∈ cr, ∀ u ∈ U : r ∈ roles(u) ⇒ (cr-{r}) I roles(u) = φ
The resulting RFOPL expression will have the following general structure. (1) The RFOPL expression has a (possibly empty) sequence of universal quantifiers as a left prefix, and these are the only quantifiers it can have. We call this sequence the quantifier part. (2) The quantifier part will be followed by a predicate separated by a colon(:) (i.e., universal quantifier part : predicate). (3) The predicate has no free variables or constant symbols. All variables are declared in the quantifier part (e.g., ∀ r ∈ R, ∀ u ∈ U: r ∈ roles(u)). (4) The order of quantifiers is determined by the sequence of OE elimination. In some cases this order is important so as to reflect the nesting of OE terms in the RCL2000 expression. For example, in ∀ cr ∈ CR, ∀ r ∈ cr, ∀ u ∈ U: predicate; the set cr, which is used in the second quantifier, must be declared in a previous quantifier as an element, such as cr in the first quantifier. (5) Predicate follows most rules in the syntax of RCL2000 except the term syntax in Figure 4. Because the reduction algorithm has a nondeterministic choice for reduction of the OE term, we may have several RFOPL expressions that are translated from a RCL2000 expression.
4 Expressive Power of RCL2000 In this section, we demonstrate the expressive power of RCL2000 by showing how it can be used to express a variety of separation of duty properties. As a security principle, SOD is a fundamental technique for prevention of fraud and errors, known and practiced long before the existence of computers [4]. It is used to formulate multiuser control policies, requiring that two or more different users be responsible for the completion of a transaction or set of related transactions. The purpose of this principle is to minimize fraud by spreading the responsibility and authority for an action or task over multiple users, thereby raising the risk involved in committing a fraudulent act by requiring the involvement of more than one individual. A frequently used example is the process of preparing and approving purchase orders. If a single individual prepares and approves purchase orders, it is easy and tempting to prepare and approve a false order and pocket the money. If different users must prepare and approve orders, then committing fraud requires a conspiracy of at least two, which significantly raises the risk of disclosure and capture.
Authorization Constraints Specification of RBAC
273
Although separation of duty is easy to motivate and understand intuitively, so far there is no formal basis for expressing this principle in computer security systems. Several definitions of SOD have been given in the literature. For the purpose of this article, we use the following definition. Role-based separation of duty ensures SOD requirements in role-based systems by controlling membership in, activation of, and use of roles as well as permission assignment. There are several papers in the literature over the past decades that deal with separation of duty. During this period various forms of SOD have been identified. Attempts have been made to systematically categorize these definitions. However, this work has significant limitations. It omits important forms of SOD including session-based dynamic SOD needed for simulating lattice-based access control and Chinese Walls in RBAC. It also does not deal with SOD in the presence of role hierarchies. Moreover, as shown, there are additional SOD properties that have not been identified in the previous literature. Here, we take a different approach to understanding SOD. Rather than simply enumerating different kinds of SOD we show how RCL2000 can be used to specify the various separation of duty properties. 4.1 Static SOD Static SOD (SSOD) is the simplest variation of SOD. In Table 1, we show our expression of several forms of SSOD. These include new forms of SSOD that have not previously been identified in the literature. This demonstrates how RCL2000 helps us in understanding SOD and discovering new basic forms of it. Property 1 is the most straightforward property. The SSOD requirement is that no user should be assigned to two roles which are in conflict with each other. In other words, it means that conflicting roles cannot have common users. RCL2000 can clearly express this property, which is the classic formulation of SSOD. It is a rolecentric property. Property 2 follows the same intuition as Property 1, but is permission-centric. Property 2 says that a user can have at most one conflicting permission acquired through roles assigned to the user. Property 2 is a stronger formulation than Property 1, which prevents mistakes in role permission assignment. This kind of property has not been previously mentioned before [5]. RCL2000 helps us discover such omissions in previous work. In retrospect, Property 2 is an “obvious property” but there is no mention of it in over a decade of SOD literature. Even though Property 2 allows more flexibility in role-permission assignment since the conflicting roles are not predefined, it can also generate roles that cannot be used at all. For example, two conflicting permissions can be assigned to a role. Property 2 simply requires that no user can be assigned to such a role or any role senior to it, which makes that role quite useless. Thus, Property 2 prevents certain kinds of mistakes in role-permissions but tolerates others. Property 3 eliminates the possibility of useless roles with an extra condition, permissions*(OE(R)) I OE(CP) ≤ 1. This condition ensures that each role can have at most one conflicting permission without consideration of user-role assignment. Property 4 can be viewed as a reformulation of Property 3 in a role-centric manner. Property 3 does not stipulate a concept of conflicting roles. However, we can interpret conflicting roles to be those that happen to have conflicting permissions assigned to
274
L. Han, Q. Liu, and Z. Yang
them. Thus, for every cpi, we can define cri ={r ∈ R cpi I permissions(R) ≠ φ }. With this interpretation, Properties 3 and 4 are essentially identical. The viewpoint of Property 3 is that conflicting permissions get assigned to distinct roles which thereby become conflicting, and therefore cannot be assigned to the same user. Which roles are deemed conflicting is not determined a priori but is a side-effect of permission-role assignment. The viewpoint of Property 4 is that conflicting roles are designated in advance and conflicting permissions must be restricted to conflicting roles. These properties have different consequences on how roles get designed and managed but essentially achieve the same objective with respect to separation of conflicting permissions. Both properties achieve this goal with much higher assurance than Property 1. Property 2 achieves this goal with similar high assurance but allows for the possibility of useless roles. Thus, even in the simple situation of static SOD, we have a number of alternative formulations offering different degrees of assurance and flexibility. Table 1. Static separation of duty Properties 1.SOOD-CR 2.SOOD-CP 3.Variation of 2 4.Variation of 1
5.SOOD-CU 6.Yet another variation
Expressions
roles*(OE(U)) I OE(CR) ≤ 1
permissions(roles*(OE(U))) I OE(CP) ≤ 1 (2) ∧ permissions*(OE(R)) I OE(CP) ≤ 1 (1) ∧ permissions*(OE(R)) I OE(CP) ≤ 1
∧ permissions(OE(R)) I OE(CP) ≠ φ ⇒ OE(R) I OE(CR) ≠ φ (1) ∧ user(OE(CR)) I OE(CU) ≤ 1 (4) ∧ (5)
Property 5 is a very different property and is also new to the literature. With a notion of conflicting users, we identify new forms of SSOD. Property 5 says that two conflicting users cannot be assigned to roles in the same conflicting role set. This property is useful because it is much easier to commit fraud if two conflicting users can have different conflicting roles in the same conflicting role set. This property prevents this kind of situation in role-based systems. A collection of conflicting users is less trustworthy than a collection of non-conflicting users, and therefore should not be mixed up in the same conflicting role set. This property has not been previously identified in the literature. We also identify a composite property that includes conflicting users, roles, and permissions. Property 6 combines Properties 4 and 5 so that conflicting users cannot have conflicting roles from the same conflict set while ensuring that conflicting roles have at most one conflicting permission from each conflicting permission set. This property supports SSOD in user-role and role-permission assignment with respect to conflicting users, roles, and permissions.
Authorization Constraints Specification of RBAC
275
4.2 Dynamic SOD In RBAC systems, a dynamic SOD property with respect to the roles activated by the users requires that no user can activate two conflicting roles. In other words, conflicting roles may have common users but users can not simultaneously activate roles that are in conflict with each other. From this requirement, we can express user-based Dynamic SOD as Property 1. We can also identify a session-based DSOD property that can apply to the single session as Property 2. We can also consider these properties with conflicting users such as Properties 1-1 and 2-1. Additional analysis of DSOD properties based on conflicting permissions can also be pursued as was done for SSOD. Table 2. Dynamic separation of duty Properties 1.Used-based DSOD 1-1.Used-based DSOD with CU
Expressions Roles*(sessions(OE(U))) I OE(CR) ≤ 1
Roles*(sessions(OE(OE(CU)))) I OE(CR) ≤ 1
Roles*(OE(sessions(OE(U)))) I OE(CR) ≤ 1 2-2.Session-based DSOD with CU Roles*(OE(sessions(OE(OE(CU))))) I OE(CR) ≤ 1 2.Session-based DSOD
5 Conclusion In this article, we have described the specification language RCL2000. This language is built on RBAC components and has two nondeterministic functions OE and AO. We have given a formal syntax and semantics for RCL2000. Any property written in RCL2000 may be translated to an expression written in a restricted form of first order predicate logic, which we call RFOPL. There is room for much additional work with RCL2000 and similar specification languages [6]. The language can be extended by introducing time and state. Analysis of RCL2000 specifications and their composition can be studied. The efficient enforcement of these constraints can also be investigated. A user-friendly front-end to the language can be developed so that it can be realistically used by security policy designers. Acknowledgments. This work was supported by National Science Foundation under Grant No:60673010 and supported by the Natural Science Foundation of Hubei Province under Grant No:2006ABC011 and supported by National Great Project of Scientific and Technical Supporting Programs Funded by Ministry of Science & Technology of China During the 11th Five-year Plan under Grant No: 2006BAH02A24.
References 1. Chen, F., Sandhu, R.S.: Constraints for Role-based Access Control. In: Proceedings of the First ACM Workshop on Role-Based Access Control, pp. 39–46. ACM Press, New York (1995) 2. Ahn, G.J., Sandh, R.: The RSL99 Language for Role-based Separation of Duty Constraints. In: Proceedings of 4th ACM Workshop on Role-Based Access Control, pp. 43–54. ACM Press, New York (1999)
276
L. Han, Q. Liu, and Z. Yang
3. Giuri, L., Iglio, P.: A Formal Model for Role-based Access Control with Constraints. In: Proceedings of 9th IEEE Workshop on Computer Security Foundations, pp. 136–145. IEEE Press, Piscataway, NJ (1996) 4. Gligor, V.D., Gavrila, S., Ferraiolo, D.: On the Formal Definition of Separation-of-duty Policies and Their Composition. In: Proceedings of the 1998 IEEE Computer Society Symposium on Research in Security and Privacy, pp. 172–183. IEEE Computer Society Press, Los Alamitos, CA (1998) 5. Jaeger, T.: On the Increasing Importance of Constraints. In: Proceedings of 4th ACM Workshop on Role-Based Access Control, pp. 33–42. ACM Press, New York (1999) 6. Osborn, S., Sandhu, R., Munawer, Q.: Configuring Role-based Access Control to Enforce Mandatory and Discretionary Access Control Policies. ACM Trans. Inf. Syst. Secur. 3(2) (2000)
Dynamic Access Control Research for Inter-operation in Multi-domain Environment Based on Risk0002 Zhuo Tang, Ruixuan Li, Zhengding Lu, and Zhumu Wen School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China hust [email protected], {rxli,zdlu}@hust.edu.cn, [email protected]
Abstract. For the complexity of the multi-domain environment and the ceaseless evolvement of the information secure sharing, the traditional access control method can not ensure the absolute security for the exchange of data resources. Through introducing the concept of risk, this paper proposes a dynamic access control model for multi-domain environment based on risk of inter-operations. The risk rank of an access policy can be calculated by the history of the inter-operations among domains, the security degree of the objects and the safety factor of the access events. Through adjusting the access policies which be considered the high risk, the risk in the system can be controlled in real time. The security analysis shows that this method can reinforce the facility of the access control and the security of the multi-domain environment.
1
Introduction
With the increase in information and data accessibility, there is a growing concern for security and privacy of data. The realization of the connections and the inter-operations among the different data sources under the distributed heterogeneous environment is becoming the practical problem. There is a growing concern for the security problem for the inter-operations between multi-domains. More and more researches try to resolve the defense for the vicious behaviors through the economic methods. In fact, recent research in these directions has suggested some economical models for a wide range of secure distributed systems, including a payment based security system for mobile agents [1] and game based model for secured grid computing [2]. However, risk remains un-quantified in these proposals. In fact, there exists an emerging consensus that every security question is indeed an economical question concerning the utility of the underlying system [3]. For example, it is easy to show that in a mobile agent based e-commerce system, both the protection of agents and hosts have a direct impact 0002
This work is partially supported by National Natural Science Foundation of China under Grant 60403027, Natural Science Foundation of Hubei Province under Grant 2005ABA258, Open Foundation of State Key Laboratory of Software Engineering under Grant SKLSE05-07.
S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 277–290, 2007. c Springer-Verlag Berlin Heidelberg 2007 0002
278
Z. Tang et al.
on the utility: attacks on a host by malicious agents will cause loss of commercial secrets such as customers private information, downtime to the system, loss of customers, which will eventually be counted as utility loss. Attacks on agents will result in similar consequences that will also lead to the lost of utility. Thus utility maximization and risk minimization are important issues in the design of a secure distributed system if we seek to gain maximum economical benefits from the underlying system. This is also an important target of the distributed system security. But at present, there are little literatures to demonstrate the actual signification of the risk of distributed system [4]. In the economic area, there exist the consanguineous relations between the risk and trust. Mayer and Rousseau [5] discussed the difference and the relationship between the risk and trust. Jarvenpaa [6] proposed definitely: trust can influence the risk in a certain extent. Further more, it can influence the subjects’ behaviors. For example, there is the lower risk when you loan money to the familiar than stranger. Moreover, this literature considers that the extent of trust can influence the cognitive extent of the risk. The objective of the inter-operations is to offer the rational distributing and effective share of the resource, and the cooperation of the distributed systems. It means the ability of the two software components to communicate and cooperate to complete a common task. It contains two meanings: the basic and the application. The basic inter-operations mean the communications and cooperation among the different platforms. And the applied inter-operations mean the cooperation among the distributed application components which above the computational platforms. This paper mainly discusses the later. The multi-domain environment has the characteristic of dynamic and indetermination. As the frequent changes of the security policies in individual autonomy, the changes of the relationship among the domains, even the birth and the death of the individual domains, the any security polices can not insure the absolute security of the data resource in the process of inter-operations. For the access control method of the traditional model, the subject sometimes can use its permissions constantly once been authorized. It hardly satisfies the dynamic changes of the multi-domain environment. And it will bring many security risks and hidden trouble to the inter-operations among multi-domain environment. In order to decrease the risk of inter-operations, for the problems of trust and risk of the security inter-operations under the application tier, which base the mappings between the users and roles in the different domains, this paper proposes a risk based dynamic access control model for multi-domain environment. In this model, through calculating the trust degree between the subjects firstly, we can ascertain the risk extent when a subject in one domain has an operation to the objects in the other domains. Therefore, we can receive the risk degree of the inter-operations. Using this risk degree, we can adjust the subjects’ access privilege dynamically. The risky permissions will be revoked, and the utility of the system will be maximized. The rest of the paper is organized as follows. Section 2 describes the related works. Section 3 presents algorithms for the calculating the risk of the
Dynamic Access Control Research
279
inter-operations. Section 4 describes the dynamic access control model for multidomain environment, followed by the conclusion in Section 5.
2
Related Works
In the recently 20 years, people have acquired the plentiful achievement for the research of the access control. Many access control models have been proposed. The most popular models include discretionary access control (DAC), mandatory access controls (MAC) and role based access control (RBAC). In the RBAC family which be proposed by Sandhu in 1996[7], the users’ privilege is related with their roles, and the users acquire their privilege through roles. A role is a permission set for a special work station. When the users’ privilege needs to be changed, we can do it by revoking the roles or re-distributing the user’s roles. Michael J. Covington et al [8] have proposed the Generalized Role Based Access Control (GRBAC) model. In this model, they extend the traditional RBAC by applying the roles to all the entities in a system. (In RBAC, the role concept is only used for subjects). By defining three types of roles, i.e., Subject roles, Environment roles, and Object roles, GRBAC uses context information as a factor to make access decisions. Guangsen Zhang et al. [9] also uses context parameters in their dynamic role-based access control model under the two key ideas: (1) A user‘s access privileges should be changed when the user’s context changes. (2) A resource must adjust its access policy when the environment context information (e.g., network bandwidth, CPU usage, memory usage) changes. These above two papers make the access control dynamic and flexible but the decision-making process is not as powerful and precise as that in our model. They did not consider the aspect of security in making-decision process and the impact of security problems on the system. The Nathan Dimmock’s paper [10] uses the concept of outcome to calculate cost for each outcome and risk value. Comparing to this paper, they do not consider the context for risk assessment. So it loses the flexibility characteristic in evaluating risk. They did not consider risk as an important factor in their access control mechanism and they did not use risk directly in making decision. There is little attention to the trust and risk in the access control research [11]. The term trust management system was introduced by Blaze et al. in [12], but the solution it proposes involves an unduly static notion of trust application programmers choose where to insert code to evaluate their notion of trust, for example at the starting point of a given execution session. Most of the past research combining access control with trust concepts focuses on a trust-management approach in which trust values flow in a manually defined way through access control policy. For example, in literature [13] and [14], the mutual trust relationship is founded by the continuously negotiation. Literature [15] illuminates the relationship between the trust management and distributed access control, and it extends the access control system of OASIS and the access control language. So, the access policy can be decided base on the trust and risk. But the trust mentioned in this paper is defined by the special operation, and the relationship between risk and trust is faint in this paper.
280
Z. Tang et al.
The above access control methods mostly base the traditional model. They are all short of the dynamic description for the subjects. With the complexity of the system and the dynamics of the applications, the changes of the access control objects are very large. Hence, these methods may increase the difficulty of the authorization. These access control models all try to protect the resource from the perspective of system. The weakness of these passive security models is that they cannot manage the privilege according to the environment dynamically. Once the subject acquire the privilege, it can use this privilege until it be revoked. It can bring the risk easily. Compared with the traditional RBAC model, the paper’s main contributions are as follows: 1. Introducing of a concept of risk into access control area. This method can ascertain the risk of inter-operations between the different domains in real time through the histories of the interactive events. It is better able to adapt to the distributed, complex, and diverse multi-domain environment. 2. Through adjusting the privilege of the subjects dynamically according to the risk levels of the access events, the functions of the access control system can be changed from the static protect for the resources to the dynamic authorization. The system can detect the environment and the security venture in real time, and the permissions of the subjects are not unchangeable anymore since be authorized. The system can identify the risky permissions automatically and revoke them duly. 3. This method can bring the convenience to the security management. The difference between this dynamic model and the traditional access control models is as follows: The management for the user’s permission settings is according to the actual events and historical records. In this way, the permission management is more convenient, and the control to the authorization is more convincing.
3
The Risk of the Inter-operation in Multi-domains
In the traditional model of the trust relationship, trust was usually defined as a Boolean variable, that is to say, in the session of both trust entities, one trust another entirely, or absolutely not, there would never be middle status. For instance, the entity A trusts entity B, but it is hard to tell how much they trust each other. For this reason, we have to quantify their trust. In this section, firstly, we formalize the definitions of the permissions and the operations between the permissions in the multi-domain environment. On this basis, we introduce the description of the trust in multi-domains. 3.1
The Formalization of the Permissions
Definition 1. Authorization Term. Authorization terms are 2-tuple of the form: < object, accessmode >, which is denoted as < O, A > for short. It is the basic form of the permission. The set of authorization terms is denoted as P . We have P = {< O, A >}.
Dynamic Access Control Research
281
Definition 2. Permission Set. Permission set represents all permissions of some subject, which is the set of the authorization terms. We can formulize it as PS. For example, we can describe a role r1 ’s all permission as: P S(r1 ) = {< f ile1, +read >, < f ile2, −write >}. That is to say the users, which are assigned to r1 , can read file1 and write file2. The denotation P Su can also express the permission set of the user u. obviously, if a role r is assigned to a user u, then P S(u) ⊇ P S(r). In this paper, the denotation role(u) represents the role set, which is assigned to user u. We can define the basic operators of the PS. The BNF definition for permission set as follows: P S = P S P S ∪ P S P S ∩ P S P S − P S SoD(P S, P S) Where the ∪,∩ and − are the basic operation in set theory, SoD(P S1 (r), P S2 (r)) denotes the separation of duties, it returns P S1 (r) or P S2 (r) , but it can return the P S1 (r) and P S2 (r) concurrently. OS = {O/D} returns the controlled objects set for a subject. D denotes the object’s domain. For instance OS(r1 ) = {f ile1/A, f ile2/A}. 3.2
The Role-Mappings Based Trust Degree Between Domains
In a typical multi-domain environment, we partition the domains into external and local domains. The role mapping can be formalized as a 4-tuple: < r1 , d1 , r2 , d2 >, r1 is a role in domain d1 , r2 is a role in domain d2 respectively, in general, d1 is the local domain, and d2 is the external domain. A subject in the local domain can access the objects in the external through the inter-domain mappings. As the mapping exhibits, the permissions of the role r1 in the domain d1 is P S(r1 ) = P S(r1 ) ∪ P S(r2 ) and OS(r1 ) = OS(r1 ) ∪ OS(r2 ). Trust is one entity assessing to behavior credibility, reliability, integrity and performance of other entity. Trust relationship is such a case: if the subject meets the object’s expectation, then the subject is trustable to the object. Definition 3. The trust degree denotes the trust extent between the different domains which be formed by the role-mappings. Which is formalized: C (d1 , d2 ), depict the trust relationship between the domain d1 and d2 . As its value range
rB2
rA3
rA1
rA2 rB1
Domain A
Domain B
Fig. 1. The role-mappings between two domains
282
Z. Tang et al.
is [0, 1], supposing a role-mapping < r1 , d1 , r2 , d1 , f alse >, C(d1 , d2 ) = 1 means the complete trust, that is to say the all permissions of the role r2 in the domain d2 can be inherited by the role r1 in the domain d1 ; by contraries, C(d1 , d2 ) = 0 means the complete distrust, that is to say the all permissions of the role r2 in the domain d2 are forbidden to be inherited by the role r1 in the domain d1 . The trust degree between domains is changed according to the inter-operation events. This is denoted by the function as follows: δ :C ×E →C
(1)
Where, E denotes the set of the inter-operation events. In general, if there are role-mappings exist between two domains, for each inter-operation, if the result is successful, the trust degree will be strengthened; by contraries, if the result is failed, it will be weakened. The following subsection discusses how to found the trust relationship between two domains. In this paper, we consider the trust in the multi-domain environment through their past transaction experiences. Considering the interoperations between the domain i and j, if the subject in domain i request to access the resources in domain j, according the estimate of each event, if the request is be satisfied, then the estimate from i to j is positive, that is denoted as trk (i, j) = 1, by contraries, if the estimate is passive, then trk (i, j) = −1. This paper defines the denotation sij as 0002 the appraisement for the all transactions between the domain i and j: sij = (tr( ij). Let’s use sat(i, j) to denote the total of the positive appraisement, while the denotation unsat(i, j) is used to denote the total of the passive appraisement. Then, sij = sat(i, j) − unsat(i, j)
(2)
For the convenience of the denotation, we mapping the value of the trust value to [0, 1]: 0003 sij sij ≥ 0 C(i, j) = sat(i,j)+unsat(i,j) (3) 0 otherwise In this way, every domain maintains local trust degrees of the other domains which the local ever affiliates with. We can use a vector to describe the trust of a local domain to the externals: T = {C(i, 1), c(i, 2), . . . , C(i, n)}, n is the number of the external domains. 3.3
The Risk of the Inter-operations in Multi-domain Environment
In general, a domain can maintain the trust vector for the externals domains which it interact with. As mentioned above, in the multi-domain environment, the risk of the inter-operations is up to the trust value of the domains C(i, j), the security level of the operation object Os , and the safety factor of the access action As . The following function defines the risk of the interoperation between the different domains:
Dynamic Access Control Research
283
Definition 4. Ri = F (C(i, j), Os , As ). The parameter Os denotes the security level of the operation object, the more high level the role, the more security level the objects can be accessed. The As denotes the security extent of the access operation. In general, the risk of the inter-operations will be increased with the heightener of the security level of the objects. Reversely, it will be decreased with the heightener of the trust between the interrelated domains and the safety factor of the access action. The function of the risk for the inter-operations is defined as follows: F (C(i, j), Os , As ) = Os × (1 − C(i, j)) × (1 − As )
(4)
By (1), we can see the value of Ri is in the range of [0, 1]. Where, the value 0 denotes no risk, and the value 1 denotes the maximal risk. The following is the algorithm for the security level of all operation objects in the special domain. The basic idea is that the leaf nodes in a role hierarchy only access the objects with the lowest security level in a special domain. That is to say, if the objects can be only accessed by the senior role, their security level is higher in the domain. The detailed algorithm is as follows. The parameter k is the basic security parameter in a special domain. k is an integers. k ≥1. Algorithm 1. The calculation for the security level of the access control objects. program Obj Security level() begin 1. Searching the role hierarchy, find the deepest leaf nodes. 2. Setting the value of the security level of the objects which can be controlled by the leaf node as k. While, the “visited” flags of the nodes are modified as “already visited”. 3. Finding the directly senior up from the leaf node, the security level of the objects which directly under the next senior node is on the basis of an increase for the objects controlled by the directly junior nodes. 4. Searching the all unvisited nodes down from the root node, the security level of the objects which directly under the next junior node is on the basis of a decrease for the objects controlled by the directly senior nodes. 5. Adjusting the security level of all nodes in the role hierarchy. The value of security level of all nodes is divided by the value of the root to be mapped to the range [0, 1] end. There is a role-mapping < A4 , A, B2 , B, f alse > exist between the domains in the figure 2. In a moment, a user in the domain A requests the operation “write” to the object O5 which is in the domain B: < O5 , write >. We set the basic security parameter k in this example as 1. Firstly, according to algorithm 1, We set security level of the directly controlled objects of the deepest leaf nodes B4 , O7 and O8 , as 1. Followed up, we can get that the security level of the directly controlled objects of B2 is 2, and the security level of the directly controlled objects of B1 is 3. By the step 5, we can get the security level of O5 : (k+1) (k+2) = 2 3 = 0.67.
284
Z. Tang et al. A1
U2
B1 U1
A2
A3
U3
A4 B2
U4
A5
O4 A7
A6 U5 O1
O6 B6
B4 O2
B3 O11
O5
B5
O3
O10 O7
Domain A
O8
O9 Domain B
Fig. 2. The inter-operations between domain A and domain B Table 1. The historical inter-operations between domain A and domain B Sequence Subject Object Operation Status 1 U1 O5 read successful 2 U2 O6 write successful 3 U3 O7 read failed 4 U4 O5 execute successful 5 U5 O8 copy failed 6 U1 O9 write successful 7 U1 O10 read successful
The safety factor for all access events in this example, which are denoted as read, copy, write, execute, is set as (0.8, 0.6, 0.4, 0.2) respectively. We suppose that the inter-operation history between domain A and B is as the table 1 shows. There are seven access events which the subject is in domain A and the object is in domain B. And five of them are successful and two failed. By the above method = 0.43. to compute the trust degree between domain A and B, we have: (5−2) 7 So, According to (4), we can get the risk of the above 2-tuple < O5 , write > as: Ri (Os , C(i, j), A) = Os ×(1−C(i, j))×(1−As ) = 0.67×(1−0.43)×(1−0.4) = 0.23 Thus, we can educe the risk rank in the multi-domain environment. In this instance, the risk is divided as 5 levels, which are as {potty, little, general, grave, verygrave}. The mapping from risk values to the risk ranks is as table 2 shows. This table can be configured by the administrators according to the special context. Table 2. The mapping from risk values to the risk ranks Sequence ranks 1 I 2 II 3 III 4 IV 5 V
values description 0 ≤ Ri < 0.2 potty 0.2 ≤ Ri < 0.4 little 0.4 ≤ Ri < 0.6 general 0.6 ≤ Ri < 0.8 grave 0.8 ≤ Ri < 1.0 very grave
Dynamic Access Control Research
285
We can conclude from the table 2 that the above operation < O5 , write >, a user in the domain A requests the operation “write” to the object O5 which is in the domain B, it’s risk rank is II. This means that the operation < O5 , write > has little risk. It may bring the failure for the operation. In the following sections, we will discuss how to avoid the risk through adjusting the privileges of the subjects.
4 4.1
The Risk-Based Dynamic Access-Control Model for Multi-domain The Model of MD-R2 BAC
The traditional security mechanism is generally designed for static network and closed system. In these systems, the authorizations of the users are determinate, and the relationship between user’s privileges and resources are found early. Based this, the protected resource are only be accessed by the authorized users. As these security models are simpler, we can call them traditional security model. But in the multi-domain environment, as the requestor and the provider of the resource can be in the different domains, because there is no absolute trust between these domains, it can not satisfy the all requests of the requestors through the traditional access control mechanism. Further more, the multi-domain environment changes frequently, and the real-time update is also unpredictable, so, the requestor and the provider of the resource may do not know each other. Therefore, the traditional models do not match the multi-domain environment well. This paper proposes a risk-based dynamic access-control model through importing the risk of the event context to the policies of RBAC. The authorization of the traditional is general denoted as 3-tuple: < S, O, A >, where S represents the subject, O represents the object, and A is the set of the actions. If a 3-tuple < s, o, a > exist, that is to say, the subject S can do the operation A on the object O. These 3-tuples are all predefined in the security system, and they are effective of all times. For the privilege constrain to the users, this access control method is passive and negative. RR UA
U
UD
Ro
RP
P
RD
D
DD
Ri
Fig. 3. The model of MD-R2 BAC
286
Z. Tang et al.
The following definitions contain some elements in the literature. The relationship between these definitions is as the figure 3. Definition 5. The variables and the relationships in this model. 1. M D − R2 BAC = (U, Ro, P, D, Ri), where U is the set of the users, Ro is the set of the roles, P is the set of the permissions, D is the set of the domains, and Ri denotes the set of the risk banks. 2. User Assignment. This is a many-to-many relationship between users and roles. It is denoted as U A, U A ⊆ U × Ro. This function can be expressed as assigned − user(ro) = {u ∈ U (u, ro) ∈ U A}. 3. Permission Assignment. This is a one-to-many relationship from roles to permissions. And it is a function which from the roles and risk ranks to the permissions. It is denoted as RP , Ro × Ri → P . The function can be expressed as: assigned − permission(ro : Ro, ri : Ri) → 2P . 4. Role Relation. This relationship contains the hierarchy and inheriting between the roles. We denote the set of the relationship of roles as RR, RR ⊆ R × R. 5. Risk between Domains. This is a mapping from the inter-operation between the domains to the risk rank. It is a function from a special inter-operation event to a certain risk rank. We can denote as DR : D × D × P 0003 → Ri, P 0003 ⊆ P. 6. User Hypotaxis. This is a one-to-many relationship from roles to domains. It is denoted as RD : R → D. Each user and role in the model must belong to a special domain, and a user or role can not be subject to two or more domains synchronously. This constrain is defined as follows: – Constrain 1. Each user in the model must only subordinate to only domain. < u, d1 >∈ U D∩ < u, d2 >∈ U D → d1 = d2 – Constrain 2. Each role in the model must only subordinate to only domain. < r, d1 >∈ RD∩ < r, d2 >∈ RD → d1 = d2 The dynamic distribution of permission in the MD-R2 BAC is mainly embodied in relation of DR, RP. The ration DR can acquire the trust degree of two domains through the inter-operation history. All the more, it can acquire the risk rank of an access event according to the security extent of the access operation and the security level of the operation object. Where, the access event between different domains can also be denoted as 6-tuple: ¡S, O, A, D1 , D2 , ri¿. The meaning of the elements is the same as the authorization item. Hence, in the relation DR, we have P 0003 ⊆ P . RP is a real-time implementation of the dynamic function. It can adjust the authorization to a subject in a domain duly according to the risk rank of the inter-operation. This paper will detail the authorization and the adjustment in the following sections.
Dynamic Access Control Research
4.2
287
The Policy and Mechanism of the Access Control in MD-R2 BAC Model
MD-R2 BAC is an access control model which can be changed with the interoperation history between different domains. It is a dynamic process that the users acquire the permissions through the roles. In this process, system can adjust the subject’s permissions according to the risk rank of its operation. In this way, the access control implementation process can be divided into three steps: privilege distribution, ascertaining the risk rank, and dynamic adjustment of the privilege. Privilege distribution. The privilege distribution includes that the administrator assigns the roles to user and predefine the permissions of the roles. Referred to above, the users’ permissions can be denoted as a 2-tuple :¡ u, ro¿, the set is formalized as ua(u,ro). We use UA(ro) to denote that assigning the role ro to a set of users U A(ro) = {ua(ui, ro) ua(ui, ro) ∈ U A(i = 1, 2, . . . , n)} In the multi-domain environment, the privilege usually performs as the power of the subject to access the objects which in the different domain. The basic authorization item can be formalized as the 6-tuple < s, o, a, d1 , d2 , ri >,which is denoted as atomp(s, o, a, d1 , d2 , ri). It means that a subject s in the domain d1 can do the operation a to the object o which is in the domain d2 , and the risk rank of this operation is ri in the current context. RP (ro) = {atomp(s, o, a, d1 , d2 , ri) atomp(s, o, a, d1 , d2 , ri) ∈ P } Ascertaining the risk rank. The risk rank ri in authorization item is a function of the access history events in the multi-domain environment. We have detail the process of calculation for the risk rank in the third part of this paper. The function which acquires the risk rank of an inter-operation is recorded as risk count(s,o,a,d 1 , d2 ). It returns the risk rank of a subject s in the domain d1 do the operation a to the object o which is in the domain d2 Ri = {ri ri = risk count(s, o, a, d1 , d2 )} Dynamic adjustment of the privilege. The prominent character of MDR2 BAC is that it can adjust the subject’s privilege according to the risk rank of its operation to the objects. We can set a risk threshold RV between two different domains. For the subject which acquire the access permissions to the objects in the other domains through the role-mappings, we can check each authorization items, and revoke the items whose risk rank over the predefined threshold RV. The dynamic adjustment of the privilege mainly reflected in the relation of privilege distribution RP. It is a function which from the roles and risk ranks to the permissions. F (Ro, Ri) → P, P = {atomp1 , atomp2 , . . . , atompn }, this is the initial permission set.
288
Z. Tang et al.
G(Ro, Ri, P1 ) → P2 , P1 ⊆ P, P2 ⊆ P, P2 = P − P1 , G is the revoke function. Return to the example in the third section, suppose that the initial permission set of the role A4 in the domain A to the objects in the domain B is P S(A4 ) = {< O5 , write >, < O6 , read >, < O7 , read >, < O8 , write >, < O9 , read >, we can acquire the risk value of these five authorization items: risk count(A4 , O5 ,write,A,B) = Ri(O s (O5 ),C(A,B),As (write)) = 0.23,the risk rank is II; risk count(A4 , O6 ,read,A,B)= Ri(O s (O6 ),C(A,B),As (read))=0.08, the risk rank is I; risk count(A4 , O7 ,read,A,B)= Ri(O s (O7 ),C(A,B),As (read))=0.04, the risk rank is I; risk count(A4 , O8 ,write,A,B)= Ri(O s (O8 ),C(A,B),As (write))=0.11, the risk rank is I; risk count(A4 , O9 ,read,A,B)= Ri(O s (O9 ),C(A,B),As (read))=0.04, the risk rank is I; If the predefined threshold RV is set as 0.2, base the policy of dynamic adjustment of the privilege, we will revoke the write permission of the subject A4 to the object O5 between domains A and B: P S(A4 ) = {< O6 , read >, < O7 , read >, < O8 , write >, < O9 , read >} Through the privilege’s dynamic adjustment, whether the subject can acquire some privilege lie on the risk rank of the relevant authorization items. The course which the subjects acquire the privilege is a dynamic and frequent process. We can decide on that whether the operation is can be executed base the operations history, the security level of the operation object, and the security extent of the access operation. Hence, the authorization is dynamic which will be adjusted with the time and the hierarchy of the subjects and objects. 4.3
The Security Analyses for the MD-R2 BAC
Comparing with the traditional security model, the contribution of the MDR2 BAC is as follows: 1. It is adapted well to the dynamic change in the multi-domain environment. In the MD-R2 BAC, the change of operations and objects can bring the change of the authorization. Through the risk rank, this model can reflect the change of the operations and the hierarchy of the access objects. Further more, these changes in this model will not affect the special authorizations. Hence, it can be adapted well to the frequently change of the multi-domain environment. 2. The more security MD-R2 BAC first imports the concept of risk to access control model, and the ultimately minimal of a subject is up to the conclusion of the access history. The risky permissions will be revoked in the dynamic adjustment
Dynamic Access Control Research
289
for the authorizations. It will restrict the risky event from the source and advance the success probability of the inter-operations. Through the dynamic adjustment, this model supports these two famous security principles: – The least of privilege. Through the risk control, the privilege with high risk rank will be revoked in time. When the subject accesses the other domain’s objects, it only holds on the relative security privilege. – The separation of duty. Sometimes, there are some sensitive objects can not be accessed by one subjects at the same time, and two different subjects also do not hope access one object simultaneity. These above can be regarded as the high risky events. These events can be identified by the estimate of the risk rank. And the system can revoke some part of the privileges to implement this security principle.
5
Conclusion and Future Work
In the multi-domain environment, the randomicity exist in the share of the information, the validity of the security mechanism, and the demand of the information exchange between users. For the complexity of the multi-domain environment and the ceaseless evolvement of the information secure share, the traditional access control method can not ensure the absolute security for the exchange of data resource. The traditional security mechanism is always designed for static network or closed system. As lacking the dynamic description of the subjects and objects, the traditional security mechanism hardly to adjust the subjects’ privilege base the security status of the system. Through introducing the concept of risk, this paper proposes a dynamic access control model MD-R2 BAC for multi-domain environment based the risk of interoperations. This model can acquire the risk of the authorization items of the subject’s privilege base the operations history, the security level of the operation object, and the security extent of the access operation. And the risky permissions will be revoked in the dynamic adjustment for the authorizations. The security analyses for the MD-R2 BAC indicate that this model can reduce the risk and hidden trouble for the information exchange in the multi-domain environment, and advance the security of the system obviously. In this paper, we only discuss the risk for a single access operation in the multidomain environment. When an attacker who combines multiple low risk operations into a new operation, how to assess the risk for the new multiple operation is a problem being worth paying close attention to. It will be our future works.
References 1. Sonntag, M., Hrmanseder, R.: Mobile agent security based on payment. Operating Systems Review 34(4), 48–55 (2000) 2. Kwok, Y.K., Song, S., Hwang, K.: Selfish grid computing: Game-theoretic modeling and nas performance results cardiff. In: CCGrid-2005. Proceedings of the International Symposium on Cluster Computing and the Grid, Cardiff, UK, May, pp. 9–12 (2005)
290
Z. Tang et al.
3. IEEE (ed.): IEEE Security and Privacy. Economics of Information Security, vol. 3. IEEE Computer Society, Los Alamitos (2005) 4. Grandison, T., Sloman, M.: A Survey of Trust in Internet Applications. IEEE Communications Surveys 3(4), 2–16 (2000) 5. Mayer, R.C., Davis, J.H., Schoorman, F.D.: An Integrative Model of Organizational Trust. Academy of Management Review (20), 75–91 (1995) 6. Jarvenpaa, S.L., Leidner, D.E.: Communication and trust in global virtual teams. Organization Science 10(6), 791–815 (1999) 7. Sandhu, R., Coyne, E.J., Feinstein, H.L., Youman, C.E.: Role Based Access Control Models. Computer 29(2) (1996) 8. Moyer, M.J., Covington, M.J., Ahamad, M.: Generalized role-based access control for securing future applications. In: NISSC 2000. 23rd National Information Systems Security Conference, Baltimore, Md, USA (October 2000) 9. Zhang, G., Parashar, M.: Context-Aware Dynamic Access Control for Pervasive Applications. In: Proceedings of the Communication Networks and Distributed Systems Modeling and Simulation Conference (CNDS 2004), Western MultiConference (WMC), San Diego, CA, USA (January 2004) 10. Dimmock, N., Belokosztolszki, A., Eyers, D., Bacon, J., Moody, K.: Using Trust and Risk in Role-Based Access Control Policies. In: Proceedings of Symposium on Access Control Models and Technologies (2004) 11. Grandison, T., Sloman, M.: A Survey of Trust in Internet Applications. IEEE Communications Surveys 3(4), 2–16 (2000) 12. Blaze, M., Feigenbaum, J., Lacy, J.: Decentralized trust management. In: Proc. IEEE Conference on Security and Privacy. AT&T (May 1996) 13. Li, N., Mitchell, J.C., Winsborough, W.H.: Design of a role-based trust management framework. In: 2002 IEEE Symposium on Security and Privacy, pp. 114–131. IEEE, Los Alamitos (2002) 14. Teh-Ming, W., Fidelis, Y.: A policy-driven trust management framework. In: Nixon, P., Terzis, S. (eds.) iTrust 2003. LNCS, vol. 2692, Springer, Heidelberg (2003) 15. Dimmock, N., Belokosztolszki, A., Eyers, D., et al.: Using Trust and Risk in Role based Access Control Policies. In: SACMAT 2004, New York, USA, June 2-4, 2004 (2004)
A Compositional Multiple Policies Operating System Security Model Lei Xia, Wei Huang, and Hao Huang State Key Laboratory for Novel Software Technology, Department of Computer Science and Technology, Nanjing University, 22 Hankou Road, Nanjing 210093, China [email protected], [email protected], [email protected]
Abstract. Multilevel security policies aim at only confidentiality assurance, with less consideration on integrity assurance and weakness in expressing channel control policies. Besides, the trusted subjects it introduces to handle the information flow “downgrade” have many security flaws. Moreover, increasing diversity of the computing environments results in various security requirements. However, current mainstream security models are aiming at only one or few requirements of them each. The Multi-Policy Views Security Model is presented, which is based on the MLS model, combining the domain and role attributes to the model, to enforce the expression power in channel control policies, make permission management more fine-grained and enhance the ability of confining the permission of the trusted subjects. Moreover, MPVSM has integrated the properties and functions of MLS, Domain-Type and Role Based models into one unified model. It is able to enforce multi-policy views in operating system in a flexible way. Keywords: security model, multiple policy views, integrity assurance, confidentiality assurance, least privilege, separation of duties.
1 Introduction Multilevel Security Model (MLS) [1] is one of the most widely used security model in various system. MLS aims at confidentiality assurance of the information, preventing the unauthorized leakage of the data in high security level. However, it rarely considers the integrity assurance, which is also a very critical security requirement [2,7,11]. Biba model is a mathematical dual of BLP, intending to protect integrity of information. However, they are both based on lattice mechanism, and information policies based on the lattice are transitive. Therefore, MLS models are weak in expressing channel control policies [22]. A firewall’s control policy is a typical channel control policy. As an example in Figure 1.1, Information is allowed to flow from the Outside component to the Inside component via the Access Control, but is forbidden to do so directly. Such policies are hardly expressed in the MLS models. S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 291–302, 2007. © Springer-Verlag Berlin Heidelberg 2007
292
L. Xia, W. Huang, and H. Huang
Firewall
Intranet
Inside
Access Control
Outside
Internet
Fig. 1. Information channel control in a firewall
In addition, in MLS models, information is not allowed to flow “downward”, such as from high confidentiality level to lower, or from lower integrity level to higher. However, in many situations, information flow “downward” is needed. Such an example of information flow from lower integrity level to the higher is the system call between user process and operating system. Data at a higher integrity level is more accurate and/or reliable than data at a lower integrity level. And the integrity level of user’s data is usually lower than the operating system’s data. Therefore, according to multilevel policies, user’s data is not allowed to flow to the operating system data objects. However, in actual computer system, though user applications are not permitted to violate the integrity of operating system data, they should be given appropriate ways to pass the information or parameters to operating system. To handle these problems, many MLS systems (such as HP-UX/CMW [8]) introduce some special trusted subjects outside the TCB. These special subjects are given privileges to bypass the MLS access control mechanisms. For they are not controlled by the access control mechanisms, they have almost all of privileges, which is far more than what they need to do they jobs. It is obviously a violation to the Principle of Least Privileges. These subjects turn to be the potential targets for malicious attacks. Once they are compromised, they can bring huge damages to the system. Moreover, various security requirements are coming up with the sharply increased diversity and complexity of the computing environments. To satisfy these security requirements, a variety of security models were proposed in last twenty years. Widely-used security policies in current mainstream systems include multi-level military security model (Bell-LaPadula model, BLP) [1] and its variants (Biba [6], Dion model), Domain and Type Enforcement (DTE) [9], Role-based access control (RBAC) [14], and etc. Each of these models aims mainly at one or several security requirements, such as BLP aiming at the confidentiality of the information, Biba aiming at data integrity assurance, DTE aiming at confining the information flow channels, etc. Previous trusted operating system usually enforced only one kind of mandatory access control model, for instance, Multics[3] implemented only BLP model in it. However, as mentioned above, the security goals in different applications are various. The different security requirements of applications result in different security models needed for them. How operating system to support this kind of multiple security model views--the access control model different applications can perceive in the system is different. Recently, as a policy neutral security model, RBAC provides a valuable level of permission abstraction. However, using RBAC to simulate multi-level security level or
A Compositional Multiple Policies Operating System Security Model
293
discretionary access control models [12] is over complex and therefore unpractical in real-world operating system. In this paper, a Compositional Multiple Policy Security Model (MPVSM) is presented. MPVSM is a hybrid security model, which is based on Multi-level Security models. It combines confidentiality and integrity lattices into a unified security lattice for confidentiality and integrity assurance. It then divides the subjects and objects in the same security level into different domains and types, using access control mechanisms in DTE to make the permission assignment and management more fine-grained and flexible, meantime enforce the separation of duties between subjects in the same security level. In addition, using the thought of RBAC, role is added in MPVSM. Roles are assigned the extra permissions, which are independent of MLS and Domain-Type parts of the model. MPVSM makes use of the flexible permission assignment and revocation mechanisms in RBAC to confine the permissions of those special “trusted subjects”, provides a way to make them do they job out of control range of the MLS and Domain-type access control parts, but meanwhile prevent them from too powerful to be potential security holes of the system. MPVSM has integrated the properties and functions of Multiple Level Security, Domain-Type and Role Based models into one unified model. By combining the elements and attributes of DTE and RBAC to the MLS model, MPVSM avoids the drawbacks of MLS. MPVSM is able to enforce channel control and assured pipelines policies, with providing fine-grained permissions management. In addition, MPVSM owns an enhanced ability of policy expression. It can ensure the enforcement of least privilege to these special “trusted subjects” in the MPVSM model. Moreover, MPVSM provides a framework to enforce multiple policy views in operating system. It can not only enforce the equivalent functions of these three kinds of models independently in the system, but also can enforce multi-policy views between different applications in system. The remainder of the paper is organized as follows. Section 2 describes the MPVSM formally. Section 3 gives the example policy configurations in MPVSM. Section 4 is the related works. And section 5 is the conclusion.
2 Multiple Policy Security Model 2.1 Overview The architecture of the MPVSM is shown in figure 2. MPVSM comprises of elements, relations and mappings. A user in the framework is a system user. A role is a job function or job title within some associated semantics regarding the authority. Subjects are active entities in the system, usually processes or transactions. Objects are data objects as well as resource objects. Domain is a control access attribute associated with each subject. And type is the other control attribute associated with objects. Two global matrixes are defined to represent allowed access or interaction modes between domains and types or domains and domains respectively. Permission is an approval of a particular mode of access to object or interaction to subject. Security label is a 2-tuple, containing a confidentiality label and an integrity label.
294
L. Xia, W. Huang, and H. Huang
There are several relations and mappings between elements. The relation between users and roles are defined in user-role assignment relation. The user-subject relation gives relation between subjects and users, while subject-role mapping figures out a subject’s current running role. Permissions in system can be authorized to roles. Roles’ authorized permissions are given in the role-permission authorization relation. Each role in system can have many authorized domains. Role-domain authorization relation gives the authorized domains of each role. Each subject has only one running domain, which is given in subject-domain mapping. Besides, each subject has a security label. The security labels are assigned to roles, and subject’s security label is determined by the label of its running role. Each object has a security attribute which includes the type and security label of that object. User
Subject
Role Label
Subject perms Role perms
Domain
Perms
Domain perms
Object Type
Label
MLS perms
Fig. 2. The MPVSM
The final permissions that a subject gets are based on three kinds of permissions corresponding to that subject: MLS permissions, Domain permissions and Role-based permissions. MLS Permissions are coarse-grained base permissions, indicating the subject has the read-related permission or write-related permission. Domain permissions are the fine-grained permissions based on the MLS Permissions. Role-based permissions is independent from MLS permissions and domain permissions. Correspond to the three kinds of permissions, MPVSM model contains three access control views: 1)Multi-level Security Access Control. The MLS permissions given to the subject are based on the security level of the subject and the target object. 2)Domain Access Control. Subjects run in different domains. A subject’s Domain permissions are based on its running domain and the target objects’ types. 3)Role-Based Access Control. The subject has the permissions of its running role as its Role-based permissions. In addition, MPVSM model is also an extensible security model, by adding other attributes to the role and object, besides the label and domain or type attributes of current role and object, more functions or properties can be added. 2.2 Formal Definitions The following definitions formalize the discussion above: Definition 2.1. Elements Sets • Users: U • Subjects: S
A Compositional Multiple Policies Operating System Security Model
295
• • • • • • • •
Objects: O Roles: R Domains: D Types: T Confidentiality Labels: C Integrity Labels: I Security Labels: SL ⊆ C×I Access modes: P, containing of two disjointed subsets: Read-related Modes RP={read(r), execute(e), getattr(g) .. }; Write-related Modes WP={ write(w), append(a), create(c), delete(d), setattr(s) .. }; P=RP WP. • Domain transfer operation: transfer(t) transfer denotes the subject can transfer from one domain to another domain. • Role Permissions CAP ⊆ P×O, (p, o) CAP denotes the permission to access o in mode p.
∪
∈
Definition 2.2. US ⊆ U×S,user-subject relation. More than one subject can run on behalf of one user at the same time. But each subject can only run on behalf of one user, called its running user. • user: S→U, mapping from subject to its running user. user(s) =u if and only if u U (u, s) US.
∧
∈
∈
Definition 2.3. UA ⊆ U×R, user-role assignment relation. Each user can be assigned many roles and each role can be assigned to many users. • UR: U→2R, mapping from user to the set of roles assigned to that user. UR(u) ={r R (u, r) UA}. • SR: S→R, subject-role mapping, from the subject to its running role. Each subject has a running user and running role at anytime, and the role must have been assigned to that user: SR(s) UR(user(s)).
∈
∈
∈
Definition 2.4. role’s security label • RL: R→SL, mapping from role to its security label. • Ssl: S→SL, mapping from subject to its security label. Subject’s security label is the same as its running role’s security label: Ssl (s)=RL(SR(s)). Definition 2.5. RD ⊆ R×D, role-domain authorization relation. Each role has many authorized domains and each domain can be authorized to many roles. • RDom: R→2D, mapping from role to its authorized domains set. RDom(r) ={d D (r, d) RD}. • SDom: S→D, mapping from subject to its running domain. Each subject is running in only one domain at anytime, and the domain is authorized to that subject’s running role: SDom(s) RDom (SR(s)).
∈
∈
∈
Definition 2.6. object’s security attribute • OT: O→T, mapping from object to its type. • OL: O→SL, mapping from object to its security label.
∈
Definition 2.7. RCAP ⊆ R×CAP, role-permission authorization relation. (r1,cap) RCAP denotes that role r1 has the role permission cap.
296
L. Xia, W. Huang, and H. Huang
• Rolecap: R→2CAP, mapping from role to its authorized Role permissions set. Rolecap(r)={cap (r,cap) RCAP}.
∈
Definition 2.8. Two control matrixes • DTM: D×T→2P, domain-type access control matrix. p DTM(d, t) denotes that the subjects in domain d can access objects with type t in mode p. • DDI: D×D→{Φ,{transfer}}, domain-domain interaction control matrix. transfer DDI(d1, d2) denotes that subjects in domain d1 can transfer into domain d2.
∈
∈
∈
Definition 2.9. Multiply Levels Security rule: MLS_rule: SL×SL→2P, a MLS_rule (ls, lo) implies that subjects with security label ls can access objects with security label lo in mode a. This rule combines the BLP confidentiality and Biba integrity lattices. Let ls=(cs, is), lo=(co, io): • • • •
≥co, permit all read-related operations, that is: RP ⊆ MLS_rule(ls, lo). <co, deny all read-related operations, that is: ∀ p∈RP, p∉ MLS_rule (ls, lo). ≥io, permit all write-related operations, that is: WP ⊆ MLS_rule(ls, lo). <io, deny all write-related operations, that is: ∀ p∈WP, p∉ MLS_rule (ls, lo).
If cs If cs If is If is
2.3 Permission Decision Definition 2.10 • MLS permission (MLP): a subject’s MLP on an object is determined as follow: mlp(s, o)={(o, p) p MLS_rule (Ssl(s), OL(o))} z Domain permission (DP): a subject’s DP on an object is determined as follow: dp(s, o)={(o, p) p DTM(SDom(s), OT(o)) z Role permission (RP): a subject’s RP on an object is determined as follow: rp(s, o)={(o, p) (o, p) Rolecap(SR(s))} A subject’s Final permissions on an object is determined as: fp(s, o)=(mlp(s, o) ∩dp(s, o)) rp(s, o).
∈
∈
∪
∈
3 Examples of Policy Configuration 3.1 Trusted Subjects’ Permission Confinement We take the information interaction between user process and operating system as an example on the information flow from lower integrity level to higher integrity level. As shown in Figure 3. User data is in lower integrity level, and operating system data in higher integrity level. In order to satisfy the needs of system calls, User process is permitted to write to the buffer data space of the operating system, but no permission to the other data object of the OS. Similarly, operating system writes the data to the buffer object of the user process. To enforce the policy described above, each process is assigned a security attribute {role, domain}, denoting the process’s running role and running domain. And each data object is assigned a security attribute {(c, i), t}, denoting the object’s confidentiality level c, integrity level i and type t, as shown in Figure 3. And Rolecap(usr_r)={(w, kerbuffer)}, role usr_r has write permission to the kerbuffer object, its security label is (0,1),
A Compositional Multiple Policies Operating System Security Model
297
User process {usr_r, usr_d} {(0,1),usr_t
{(0,1), usrbuf_t}
usrbuffer
usrprivate
Kernel process {ker_r, ker_d} {(0,2),kerbuf_t}
kerbuffer
kerprivate
{(0,2),ker_t}
Fig. 3. Information interaction between user process and operating system Table 1. DTM ker_t ker_d usr_d
r,w
Kerbuf_t
usr_t
r
usrbuf_t w
r,w
r
and its authorized domains set is {usr_d}. Rolecap(ker_r)= Φ, ker_r has no role permission, its security label is (0,2), and its authorized domains set is {ker_d}. The DTM and DDI between the domains and types are shown in Table 3.1. We isolate the operating system data and user data from each other by dividing them into different integrity level. The usrbuffer and usrprivate data objects which are in the same integrity level are divided into different types, therefore User process can have different fine-grained permissions to these two objects. According to the MLS policy, User process has no write permission to the kerbuffer objects for its integrity level is lower than the object. However, this write permission is necessary for getting job done. So, we assign write permission on object kerbuffer to the role usr_r, this permission is independent of MLS and Domain permissions. In this way, the function of system call is achieved without giving too much permission to the User process to bring potential security flaws to system. 3.2 Channel Control We design a simplified firewall system to demonstrate the use of MPVSM in configuring channel control policies. The firewall is shown in Figure 4. The security policy of the firewall is that all information flow from outside network to inside network or in reverse direction must be checked by the access control component. It can be described as follow: information is only allowed to flow from the Outside to the Inside or in reverse direction via the Access Control. It can not be flowed directly between them. Besides, all components are permitted to read the Config without modifying it. And all components can append information to the Log, but can not read it. Our configurations to enforce this policy are shown in Figure 3.2. The DTM and DDI between domains and types are shown in Table 3.2. And Rolecap(fw_r) =Φ, role
298
L. Xia, W. Huang, and H. Huang
Firewall
Log {(3,0),con_t} {f_r, ac_d} Access Control
{(1,1),out_t}
{(1,1),in_t
Outside
Inside
Intranet
{fw r, in d}
Internet
{fw_r, out_d}
Config
{(0,3),con_t}
Fig. 4. Policy configuration of the Firewall Table 2. The DTM and DDI in_t in_d
con_t
r,w
r,a
r,w
r,a
r,w
out_d ac_d
out_t
r,w
in_d
out_d
ac_d
r,a
fw_r has no Role permissions. The security label of the fw_r is (1,1), and its authorized domains are {ac_d, in_d, out_d}, there is no transfer permission between any two of these three domains. We upgraded the confidentiality level of the Log to make it unreadable to the components, upgraded the integrity level of the Config to make it unmodifiable by components. Then we divided the subjects of the same security level to different domains, and objects of the same security level to different types. By controlling the fine-grained permissions between the domains to types, information channel control policy between the inside and outside network is enforced. 3.3 Enforcing Multiple Security Policies Through different configuring ways, Multi-Level security model, DTE and RBAC can be enforced separately in the MPVSM, and multi-policy views between different user groups can be enforced too. 3.3.1 Enforcing Multi-level Security model The way configuring MPVSM to enforce Multi-Level Security model, which based on both confidentiality and integrity lattices, is described as following:
A Compositional Multiple Policies Operating System Security Model
299
1. R = SL , number of roles in the system is the same as the number of the security labels. Each role corresponds to one security label. 2. D={gen_d}, T={gen_t}, there are only one domain and one type in the system. RD={(r, gen_d) r R}, indicates that all roles’ authorized domain is gen_d. The type of all objects is gen_t: OT={(o, gen_t) o O}. 3. DTM={(d, t, p) d D, t T, p P}, which indicates domain gen_d have all domain permissions to type gen_t. 4. Rolecap(r:R)=Φ, no role permission is authorized to every role.
∈ ∈ ∈ ∈
∈
3.3.2 Enforcing DTE 1. R={gen_r}, only one role in system. UA={(u, gen_r) u U}, role gen_r is assigned to every users. 2. RD={(gen_r, d) d D}, indicates that all domains in system are authorized to the role gen_r. 3. SL={(Only_C,Only_I)}, only one security label in the system, therefore: RL= {(r, Only_C, Only_I)) r R}. 4. Rolecap(r:R)= Φ.
∈
∈
∈
3.3.3 Enforcing RBAC 1. D={gen_d}, T={gen_t}, only one domain and one type in system. RD={(r, gen_d) r R}, gen_d is authorized to every role in system. The type of all objects is gen_t, OT={(o, gen_t) o O}. 2. DTM(gen_d, gen_t)=Φ, denotes subjects in gen_d domain have no domain permissions to objects in type gen_t. 3. SL={(Only_C, Only_I)}, only one security label. RL={(r,(Only_C,Only_I)) r R}.
∈
∈
∈
3.3.4 Enforcing Multiple Model Views Assume all users in the system can be divided into three groups: Grpa, Grpb and Grpc. Now we may hope that users in each group can perceive different access control model views. For instance, users in Grpa think that the security model enforced in system is MLS, users in Grpb think that is RBAC and users in Grpc think that is DTE. The configuring method that enforces this multi-model views in one system is given as below.
∪
∪
1. U=Grpa Grpb Grpc, the three sets are disjointed each other. 2. R= mls_rs rbac_rs {dte_r}. mts_rs is the roles set corresponding to MLS model. rbac_rs is the roles set corresponding to RBAC model. And dte_r is the role corresponding to DTE model. 3. D= {mls_d} {rbac_d} dte_ds. 4. (u, r) UA (u, r’) ∉ UA, where u Grpa, r mls_rs, r’ ∉ mls_rs, the roles in mls_rs are only permitted to be assigned to users in Grpa. (u, r) UA (u, r’) ∉ UA, where u Grpb, r rbac_rs, r’ ∉ rbac_rs, the roles in rbac_rs can only be assigned to users in Grpb. Similarly, (u, dte_r) UA (u, r) ∉ UA, where u Grpc, r≠dte_r, every user in Grpc is assigned the only role dte_r. 5. mls_rs = SL , number of roles in set mls_rs is the same as the number of security labels in system. Each role in mls_rs corresponds to one security label. MLS_rule(Ssl(r), tsl)=M, r rbac_rs, tsl SL, have all of possible MLS permis-
∪
∪
∪
∪
∈ ∧
∈
∈
∈
∈
∈
∈ ∧
∈ ∧
∈
∈
300
L. Xia, W. Huang, and H. Huang
∈
6.
7.
sions to other subjects or objects. MLS_rule(Ssl(dte_r), tsl)=M, tsl SL, role dte_r’s MLS permissions to other subjects or objects include all of possible permissions too. The security label of the roles in rbac_rs is the lowest level label of the system, that is: ∀ r rbac_rs, RL(r)=(cs, is), ∀ c C, cs c and ∀ i I, is i. the security label of dte_r is the highese level label of the system, that is: RL(dte_r)=(cs, is), ∀ c C, cs c and ∀ i I, is i. (r, mls_d) RD (r,d) ∉ RD, where r mls_rs, d≠mls_d, every roles in mls_rs is authorized the only domain mls_d. (dte_r, d) RD (r’, d) ∉ RD, where r’≠dte_r, d dte_ds, all domains in dte_ds are authorized to role dte_r. Simliarly, (r,rbac_d) RD (r,d) ∉ RD, where r rbac_rs, d≠rbac_d, every role in rbac_rs is authorized the only domain rbac_d. (mls_d,t,p) DTM,t T, p P. DDI(mls_d, d)=Φ, d D, subjects in domain mls_d can not transfer to other domains. (rbac_d, t, p) DTM, t T, p P, the subjects in domain rbac_d have all domain permission to all types’ objects in system. DDI(rbac_d, d)=Φ, d D, subjects in domain rbac_d can not transfer to other domains. For every r mls_rs {dte_r} and c CAP, (r, c) ∉ RCAP, there is no role permission authorized to roles in set mls_rs and the role dte_r.
∈
≥
∈ ∈ ∧
8. 9.
10.
∈
∈ ≥ ∈ ∧
≤
∈
∈
∈
∈ ∈ ∈ ∈ ∈
∈
∪
∈ ≤
∈
∈ ∧
∈
∈
∈
4 The Related Works Bell-LaPadula [1] (BLP) model mainly emphasizes the protection of confidentiality. It is able to limit flow of information and unauthorized information leakage. However, it does not care about the integrity, which is also important [2,7,11]. Besides, BLP is weak in channel control of information flow [22]. Biba Integrity Model [6] is the mathematical dual of BLP, intending to protect the integrity in system. Type enforcement is a table-oriented mandatory access control policy for confining applications and restricting information flows. DTE [9] is an enhanced version of type enforcement designed to provide needed simplicity and compatibility. Role-based access control [14] provides a valuable level of abstraction to promote security administration at a business level. The Flask [10] security architecture emphasizes diverse security policies support. However, it applies only MAC to the Fluke Microkernel. It provides the mechanisms for diverse policies without giving how to enforce multiple policies in system. One of the earliest MAC mechanisms in operating system is Lattices [1, 20]. For instance, LOMAC [13] enforces Biba integrity. CMW [8] can dynamically relabel the current object for increased flexibility. Recently, Asbestos [17] provides labeling and isolation mechanisms that can support applications to express a wide range of policies and make MAC more practical. KernelSec [18] aims at improving the effectiveness of the authorization model and the security policies that can be implemented. In capability-based confinement systems, KeyKOS [21] achieved military-grade security by isolating processed into compartments and interposing reference monitors to control the use of capabilities. EROS [19] later successful realized this principles on
A Compositional Multiple Policies Operating System Security Model
301
the modern hardware. And the Coyotes kernel [5] mainly explores use of software verification techniques to achieve higher confidence in the correctness and security of the kernel. Mandatory access control can also be achieved with unmodified traditional operating system through virtual machines [16, 4].
5 Conclusions The Compositional Multiple Policy Security Model is presented, which is based on the MLS model, combining the domain and type attributes to the model, to eliminate the limitations of MLS models. It has enforced expression power in channel control policies, and made permission management more fine-grained and enhanced the ability of confining the permission of the trusted subjects. MPVSM is also able to enforce multiple policy views in operating system in a flexible way.
References 1. Bell, D., La Padula, L.: Secure Computer Systems: Mathematical Foundations. Technical Report MTR-2547, vol. I, MITRE Corporation (1975) 2. Amoroso, E., Nguyen, T., Weiss, J., et al.: Towards an Approach to Measuring Software Trust. In: 1991 IEEE Symposium on Research in Security and Privacy, pp. 198–218 (1991) 3. Organick, E.: The MULTICS System: An Examination of Its Structure. MIT Press, Cambridge (1972) 4. Karger, P.A., Zurko, M.E., Bonin, D.W., et al.: A VMM security kernel for the VAX architecture. In: 1990 IEEE Symposium on Security and Privacy, pp. 2–19 (1990) 5. Shapiro, J., Doerrie, M.S., Northup, E., et al.: Towards a Verified, General-Purpose Operating System Kernel. In: 1st NICTA Workshop on Operating System Verification (2004) 6. Biba, K.: Integrity Considerations for Secure Computer Systems. Technical Report MTR-3153, MITRE Corporation (1977) 7. Eswaran, K., Chamberlin, D.: Functional Specifications of Subsystem for Database Integrity. In: The International Conference on Very Large Data Bases (1975) 8. Berger, J.L., Picciotto, J., Woodward, J.P.L., Cummings, P.T.: Compartmented mode workstation: Prototype highlights. IEEE Transactions on Software Engineering, Special Section on Security and Privacy 16, 608–618 (1990) 9. Badger, L., Sterne, D.F., Sherman, D.L., et al.: A Domain and Type Enforcement UNIX Prototype. In: 5th USENIX UNIX Security Symposium (1995) 10. Spencer, R., Smalley, S., Hibler, M., et al.: The Flask Security Architecture: System Support for Diverse Security Policies. In: 8th USENIX Security Symposium, pp. 123–139 (1999) 11. Lipner, S.: Non-Discretionary Controls for Commercial Applications. In: 1982 Symposium on Privacy and Security (1982) 12. Osborn, S., Sandhu, R., Munawer, Q.: Configuring Role-based Access Control to Enforce Mandatory and Discretionary Access Control Policies. ACM Transactions on Information and System Security 3, 85–106 (2000) 13. Fraser, T.: LOMAC–low water-mark mandatory access control for Linux. In: 9th USENIX Security Symposium (1999)
302
L. Xia, W. Huang, and H. Huang
14. Sandhu, R., Coyne, E., Feinstein, H., Youman, C.: Role-Based Access Control. IEEE Computer 29 (1996) 15. Loscocco, P., Smalley, S.: Meeting critical security objectives with security-enhanced linux. In: Ottawa Linux Symposium 2001 (2001) 16. Goldberg, R.P.: Architecture of virtual machines. In: AFIPS National Computer Conference, vol. 42, pp. 309–318 (1973) 17. Efstathopoulos, P., Krohn, M., VanDeBogart, S., et al.: Labels and Event Processes in the Asbestos Operating System. In: 20th Symposium on Operating Systems Principles (2005) 18. Radhakrishnan, M., Solworth, J.A.: Application Support in the Operating System Kernel. In: ACM Symposium on Information, Computer and Communications Security (2006) 19. Shapiro, J.S., Smith, J.M., Farber, D.J.: EROS: A Fast Capability System. In: 17th ACM symposium on Operating systems principles (1999) 20. Saltzer, J.H., Schroeder, M.D.: The protection of information in computer system. Proceedings of the IEEE 63, 1278–1308 (1975) 21. Key Logic. The KeyKOS/KeySAFE System Design (1989), http://www.agorics.com/ Library/KeyKos/ keysafe/Keysafe.html 22. Rushby, J.: Noninterference, Transitivity, and Channel-Control Security Policies. Technical Report CSL-92-02, Computer Science Lab, SRI International (1992)
Longer Randomly Blinded RSA Keys May Be Weaker Than Shorter Ones Colin D. Walter Comodo Research Laboratory 7 Campus Road, Bradford, BD7 1HR, UK [email protected]
Abstract. Side channel leakage from smart cards has been of concern since their inception and counter-measures are routinely employed. So a number of standard and reasonable assumptions are made here regarding an implementation of RSA in a cryptographic token which may be subjected to non-invasive side-channel cryptanalysis. These include blinding the re-usable secret key, input whitening, and using an exponentiation algorithm whose operation sequence partially obscures the key. The working hypothesis is that there is limited side channel leakage which only distinguishes very imprecisely between squarings and multiplications. For this typical situation, a method is described for recovering the private exponent, and, realistically, it does not require an excessive number of traces. It just requires the modulus to be public and the public exponent not to be too large. The attack is computationally feasible unless parameters are appropriately adjusted. It reveals that longer keys are much more vulnerable than shorter ones unless blinding is proportional to key length. A further key conclusion is that designers must assume that the information theoretic level of leakage from smart cards can be transformed into usable key information by adversaries whatever counter-measures are put in place. Keywords: Side channel leakage, power analysis, SPA, DPA, RSA.
1
Introduction
Side channel leakage of secret key information from cryptographic devices has been known publicly for a number of years [1], and very widely since the work of Kocher [6,7]. In the case of RSA, the main software counter-measures to this have included message whitening, key blinding and more complex exponentiation algorithms. These, therefore, form the main assumptions here. In the past, there were no obvious ways of extracting weak leaked information from this and using it to recover the secret key. Either the leaked information had to distinguish clearly between squarings and multiplications for individual uses of the key [3] or, with less precise leakage, the same key had to be re-used many times in an unblinded state so that the leakage could be averaged to reduce noise [6,2,13]. S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 303–316, 2007. c Springer-Verlag Berlin Heidelberg 2007 0002
304
C.D. Walter
However, from the information-theoretic standpoint, it is clear that there can be enough data for the key to be recovered when blinding is used but side channel leakage is imprecise. Here a means for obtaining the key is given for that situation, developed from the case of perfect, but partial, side channel information described by Fouque et al. [3]. One of the main contributions here is a metric for evaluating choices and enabling the best to be investigated first. The first objective is to determine the blinding factor for several dozen cases. This is done by brute force: testing every possible value until one is found which would provide a trace that matches the measured trace sufficiently well under a suitable metric. The analysis is complicated by an unknown factor k equal to the size of the public exponent E. That factor must also be determined from the side channel leakage in the same way, and therefore affects the computational feasibility of the method if E is large. However, there is no obvious way to avoid the exhaustive search. Once the blinding factors are determined for as many traces as are needed, the second objective is to determine the unblinded private exponent. Its bits are guessed from most to least significant by looking at both possible values and selecting the one which matches the observed leakage better. Incorrect choices are quickly noticed, and corrected by back-tracking and lookahead. This phase of the attack is less computationally intensive than the first, but it requires more traces when the leakage is weaker – a number inversely proportional to the strength of leakage. The adversary makes use of certain properties of the exponentiation algorithm which lead to the leakage. The standard 4-ary sliding windows [4] considered here has a pattern of squarings and multiplications which contains information about the bit pattern of the exponent. The method applies equally well to any other algorithm with a variable pattern of operations where the variation is derived from a local property of the secret key, such as bit values. Finally, the complexity of the attack is considered. There is low space complexity and the attack is highly, and easily, parallelisable to make full use of computing resources. The total time complexity is of order which is the product of the public key, the maximum blinding factor and a measure of the unreliability of the side-channel leakage. Thus, it appears to be computationally feasible to extract the key in many normal circumstances. A significant conclusion is that, for a fixed amount of blinding, longer keys are less secure because blinding factors are determined more accurately. This means that blinding should be increased in proportion to key length in order to thwart the attack. The organisation of the paper is as follows. The main assumptions, the leakage model, pre-requisite notation and background algorithms are covered in sections §2 to §5. Phase 1 of the attack, during which the blinding factors are recovered, is treated in §6. Phase 2 of the attack, namely the recovery of the secret key, is described in §7. The computational cost is reviewed in §8 and wide-ranging conclusions are drawn in §9.
Longer Randomly Blinded RSA Keys May Be Weaker Than Shorter Ones
2
305
Notation
The n-bit RSA modulus N and public exponent E are assumed to be known by the attacker. His aim is to recover the private key D which is re-used a number of times but only in the blinded form Di = D+ri φ(N ) where ri < R is a small random number (typically up to 32 bits) and φ(N ) is unknown. The modulus is a product of two primes N = P Q which must be of similar magnitude. For convenience, we assume P and Q have the 0002bits. √ same number of Then, without loss of generality, P < Q < 2P so that 2 N < P +Q < 3 N/2 and φ(N ) = N −(P +Q)+1 is bounded by 0002 √ (1) N − 3 N/2 + 1 < φ(N ) < N − 2 N + 1 √ This interval has length less than 18 N , so that more than half of the most significant bits of φ(N ) are known by the attacker from those of N . The exponents D and E are related by D×E = 1+kφ(N )
(2)
for some k. Without loss of generality, let D be the smallest non-negative solution to this congruence, so that D < φ(N ) and k < E. When key D is re-used many times, blinding factors are normally added to produce the randomly different exponents which are actually used for decryption or signing [6]: Di = D+ri φ(N )
(3)
where ri is a random number, usually of 16 to 32 bits. Thus, Di =
1 + (k+ri E)φ(N ) E
(4)
Let R be an upper bound on such ri . Then the coefficient k+ri E of φ(N ) is, in effect, a random number in the range 1 to RE. (So it is irrelevant whether k and D were chosen minimally in equation (2)). In equation (4) the adversary is initially only interested in the most significant half of the bits. He ignores the 1 and approximates φ(N )/E by computing N/E. By the earlier remarks this gives him at least the top n/2 bits of φ(N )/E. So, Di ≈ (k+ri E)N/E
(5)
The attacker now has to generate each of the RE possible values of the random coefficient of N/E in order to obtain a set containing an approximation to the value of Di used in the exponentiation which he has observed.
3
The Exponentiation
For convenience we assume that the exponentiation algorithm is 4-ary sliding windows using the re-coding in Fig. 1 [4,5]. This uses a window of 1 bit width
306
C.D. Walter Input: Binary D = (bn−1 ..b2 b1 b0 )2 Output: Recoding D = (dm−1 ..d2 d1 d0 ) i ← 0 ; m ← 0 ; While i < n do If bi = 0 then Begin dm ← 0 ; i ← i+1 ; m ← m+1 ; End else Begin dm ← 2bi+1 + 1 ; i ← i+2 ; m ← m+1 ; End Fig. 1. Quaternary Sliding Windows Recoding
when there is a digit zero in the recoded exponent, and otherwise a window of 2 bits width, for which the digit is 1 or 3. Although this does not provide the same protection against side channel cryptanalysis as the square-and-always multiply algorithm, it is more time efficient even than the usual square-andmultiply algorithm and also creates some difficulty for an attacker who may have to distinguish whether the multiplications pertain to digit 1 or digit 3. This algorithm, or its fixed-width equivalent, is typical of a smart card because of its speed and low storage overhead: only the first and third powers of the input message need storing for the exponentiation. Both algorithms generate a pattern of squarings and multiplications which is related to occurrences of the zero digit. This is the property that can be exploited here by an attacker.
4
The Leakage Model
With expected counter-measures in place it is unrealistic to assume that every long integer multiplicative operation in an RSA exponentiation can be identified and distinguished as a squaring or not. However, some imperfect deductions may be possible from a side channel trace, particularly in contactless cards where severe resource limitations and an explicit aerial limit the scope and effectiveness of any counter-measures. The following two leakage scenarios are likely in practice. Others are certainly possible. First, because the conditional subtraction in a Montgomery modular multiplication ([8], see Fig. 2) consumes a number of extra clock cycles, there is a possibility that it may be observed in a side channel trace via the longer time for the operation. The slightly different frequencies of the subtraction for squares and multiplies mean that each occurrence or absence of the subtraction makes
Longer Randomly Blinded RSA Keys May Be Weaker Than Shorter Ones
307
Input: A and B such that 0 ≤ A, B < N < r n and N prime to r. Output: C = ABr −n mod N C ← 0 ; For i ← 0 to n-1 do Begin qi ← -(c0 +ai b0 )n0 −1 mod r ; C ← (C+ai B+qi N) div r ; End ; { Assertion: Cr n ≡ A×B mod N and ABr −n ≤ C < N +ABr −n } If C ≥ N then C ← C-N ; Fig. 2. Montgomery’s Modular Multiplication Algorithm (MMM)
a square or multiplication marginally more likely [13]. As previous attacks have been unable to use this information in the presence of exponent blinding and message whitening, implementors may not perceive the leakage as a threat when such counter-measures are in place. One can therefore expect many of them to prefer more widely applicable code which includes the conditional subtraction, despite the existence of straightforward and efficient alternatives [12]. Secondly, the data loading cycles for multiplications and squarings are different and therefore vulnerable. For example, the Hamming weight of the words of the arguments may leak when they pass along the internal bus [7]. A squaring is almost certain where the Hamming weights are equal, and a multiplication must be the case if they are different. However, this information is usually well submerged in noise, and in a well designed implementation it should only yield a minimal bias towards a squaring or a multiplication. The above are very much more realistic leakage models than that of [3] where it was assumed that each multiplicative operation was known to be a squaring or a multiplication. In practice, only weak probabilistic information is known. In order to obtain specific measures of implementation strength, the attack here is modelled on the level of data leakage from observing every conditional subtraction in Montgomery modular multiplication. However, the attack is generic, and applies to both of the above scenarios as well as many others.
5
Selecting the Leakiest Traces
The word-based algorithm for Montgomery multiplication (MMM) is given in Fig. 2 where the digits ai , bi etc. are for the base r representation of the long integers A, B etc. From the assertion after the loop it is easy to establish the frequency of the conditional subtraction under the reasonable assumption of the output residues being uniformly distributed modulo N . The probability is proportional to the fraction of the interval which is greater than N , namely ABr−n N −1 . For a typical multiplication with independent arguments, this can be summed with respect to A and B over the interval [0, N ) to obtain the average probability of
308
C.D. Walter
pM ≈
1 −n Nr 4
(6)
Similarly, setting A = B and summing gives the probability of the subtraction for a squaring, namely 1 pS ≈ N r−n (7) 3 The difference between pM and pS shows that the occurrences of a conditional subtraction indicate a squaring is slightly more likely to be the case than a multiplication. The difference, however, is small. Early attacks on Montgomery’s algorithm relied on being able to perform hundreds or thousands of exponentiations with the same key in order to observe enough subtractions to conclude with high probability whether the operation was a squaring or a multiplication. The formulae (6) and (7) also indicate that decreasing N or increasing the number of iterations n will reduce the occurrences of the conditional subtraction and so make the algorithm more secure. However, the multiplications in individual exponentiations are not as random as used for the formula (6). For 4-ary sliding windows, one of the two precomputed powers of the input message is used as one of the arguments, the other being a random output from an earlier Montgomery multiplication. So only one argument is uniformly distributed in a given exponentiation. Let A be the fixed input to such a multiplication. Then, summing ABr−n N −1 with respect to B yields the true probability of a subtraction, viz. pA ≈
1 −n Ar 2
(8)
Thus, when the pre-computed powers of the input are small (resp. large) there will be very few (resp. many) conditional subtractions resulting from multiplications. This increases the probability of distinguishing between squarings and multiplications. Overall, this will be noticed by an adversary because the total number of conditional subtractions will be less (resp. greater) than the average for such exponentiations. This provides the opportunity for the adversary to select the leakiest traces with very little computational effort. Similarly, in the Hamming weight leakage scenario, instead of an enhanced or reduced frequency of conditional subtractions from large or small values of A, the adversary homes in on the argument pairs A, B which are the highest Hamming distance apart. They have the highest probability of being multiplications. This occurs most frequently when the re-used, pre-computed multiplier A has the highest number of extreme Hamming weights. So, by screening for extreme Hamming weights, traces which leak significantly more information than average can be identified easily by the adversary. In both leakage models, the attacker can therefore begin by selecting side channel traces which yield the greatest amount of information, and these can be chosen without excessive computational effort for the initial phase of data capture, signal processing and selection.
Longer Randomly Blinded RSA Keys May Be Weaker Than Shorter Ones
6
309
The Attack: Phase 1
In the leakage scenarios of §4, the attacker is expected to obtain little or no useful information about a multiplicative operation in many cases, and only a very weak probability in favour of a squaring rather than a multiplication (or vice versa) in other cases. However, as described in §5, he begins his attack by collecting as many traces as possible and selecting those for which the leakage promises to be greatest. Phase one of his attack then progresses as follows. Suppose he has selected a promising trace corresponding to the use of the blinded exponent Di . He first determines the top half of the digits of φ(N )/E as in §2 and then guesses the values of k and ri . Equation (5) gives him a possible approximation Di 0003 for Di . He then compares the side channel leakage expected from Di 0003 with that obtained from Di and discards the guessed pair (k, ri ) if the match is poor. Repeating this for all pairs leaves him with a set Si of the most likely blinding values for Di . This process is repeated with more traces – enough for him to complete the second phase successfully. The decision about whether guesses are good enough is based on a metric μ(tr(Di ), ops(Di 0003 , m)). The first parameter tr(Di ) is the processed side channel leakage from use of the unknown blinded key Di . Specifically, it is a list of probabilities pr(op) that the operations op of the exponentiation using Di were squares rather than multiplications. Thus tr(Di ) = [pr(ops ), pr(ops−1 ), .., pr(op3 ), pr(op2 ), pr(op1 )] where s = len(tr(Di )) is the total number of operations in the exponentiation with key Di . In the second parameter, Di 0003 is the bit sequence for the guessed value of Di , and ops(Di 0003 , m) is the sequence of multiplicative operations carried out in an exponentiation with key 0004Di 0003 /2m 0005. So the m least significant bits of Di 0003 are irrelevant, and need not have been guessed yet. This parameter will be a list containing, say, ‘0’ to denote a squaring, and ‘1’ a multiplication. In this phase we will set m = n/2+ log2 R. If the side channel leakage tr = tr(Di ) for Di indicates with probability trj that the jth operation was a squaring and the jth operation in ops(Di 0003 , m) is also a squaring then trj − 12 is added to the metric. However, if a multiplication occurs as the jth element in ops(Di 0003 , m), then 12 −trj is added. So 0 is added if the leakage provides no information since then trj = 12 , but there is a positive contribution to the sum when the operation in tr(Di ) is more likely to be the same as that in ops(Di 0003 , m), and there is a negative contribution when the two operations are more likely to be different. Thus, the sum for calculating the metric is μ(tr, ops(D0003 , m))
=
0003 1≤j≤nops(0005D0002 /2m 0006)
0002 1 (−1)ops(D ,m)j (trj − ) 2
(9)
when ops(D0003 , m)j ∈ {0, 1} as suggested above, and nops(D00030003 ) is the number of operations in exponentiating to the power D00030003 . Here the lists tr and ops(D0003 , m) are in temporal execution order assuming an exponentiation algorithm which processes bits from most to least significant. This correctly aligns corresponding elements of the lists when the lowest bits of D0003 are ignored.
310
C.D. Walter
If the guess (k, ri ) is correct, the value Di 0003 provides the same pattern of squarings and multiplications as Di over the first (approximately) half of the operations of the exponentiation. So, unless the noise is overwhelming, this should maximise the value for the sum when restricted to those operations. Therefore larger values for μ imply a better match between the guessed value Di 0003 and the targetted exponent Di , whereas smaller and negative values indicate a poor match. Because of the unknown difference between N and φ(N ), the n/2+ log2 R least significant bits of Di 0003 are unreliable even when (k, ri ) is guessed correctly. By taking m = n/2+ log2 R the operations corresponding to them are ignored. The metric could be improved by taking account of the dependence between consecutive operations: for example, the probable occurrence of a multiplication implies that the next operation is more likely to be a squaring. Schindler [9,11,10] treats this in detail and provides theory about the best choice of metric. 6.1
Phase 1 Simulation
In our simulation, values were chosen which correspond to a leaky implementation of MMM where every conditional subtraction is observed and N ≈ rn . Conditional subtractions were generated randomly with frequencies in accordance with the model described in §5. There was no selection of “better” traces on the grounds of fewer or more subtractions than normal. Since conditional subtractions occur with slightly greater frequency for squarings than for multiplications, the metric was (arbitrarily) incremented by pj − 12 = 0.1 for every conditional subtraction in the trace when there was a squaring in the guessed value and therefore incremented by 12 −pj = −0.1 (i.e. decremented) for every conditional subtraction that coincided with a multiplication. Table 1. Proportion of Guesses returning a Higher Value of μ than the Correct One log2 RE
8
n = 384 7.9×10−3 512 2.7×10−3 768 5.3×10−4 1024 2.0×10−5 1536 < 2.5×10−7
12
16
32
48
8.0×10−3 2.4×10−3 3.2×10−4 1.9×10−5 < 2.5×10−7
8.2×10−3 3.4×10−3 1.0×10−3 8.7×10−4 < 2.5×10−7
5.0×10−3 2.0×10−3 1.4×10−4 1.7×10−5 ..
3.6×10−3 1.0×10−3 1.2×10−4 8.0×10−6 ..
With only this weak knowledge to distinguish between squarings and multiplications, the “best” guess is rarely the correct one. The correct values are ranked among the best, but do not usually come top. Therefore, to assess the feasibility of the attack, it is necessary to know the size of the set S of best guesses which is big enough to include the correct guess. This depends on the strength of the leakage. With the parameters just described, the results in Table 1 were obtained. It gives a good indication of how well the matching process works and shows the minimum proportion of all guesses which must be considered if the correct one is not to be excluded. For example, with a modulus of n = 1024 bits, and
Longer Randomly Blinded RSA Keys May Be Weaker Than Shorter Ones
311
RE = 232 , the leakage of interest is from the top 512 or so bits. Then the metric μ places the correct values (k, ri ) above all but about RE×1.7×10−5 ≈ 216 incorrect values, on average. In information theoretic terms, the metric has extracted about 16 bits from the side channel, i.e. about 1 bit in every 512/16 = 32. This is the case for all the entries in the table: they all correspond to about 1 bit per 32 in the top half of the key, i.e. n/64 bits in total. An improved metric is possible (e.g. taking into account multiplications having to be next to squarings) and this would enable more information bits to be obtained. However, for 2048-bit keys (not tabulated), this means about 32 bits’ worth of information is recovered, so that k and ri should be determined almost uniquely when RE ≤ 232 . This is indeed what was found in the simulation. Clearly, longer keys are more vulnerable: – For a given size of blinding and public exponent, the longer the key, the more likely (k, ri ) is to be guessed correctly and uniquely. The figures in the table show little effect from increasing RE. k and ri blind information equivalent to about log2 RE bits’ worth of operations. However, longer blinding factors also seem to constrain the pattern of the blinded key more tightly. With these conflicting forces, the average success of the method is little changed: the number of bits leaked depends almost entirely on the length n of the key. Consequently, for a given key length, the same proportion of choices (k, ri ) are removed irrespective of the value of RE. Of course, the number of accepted pairs must still increase directly in proportion to RE. Thus, – Typical leakage from 2048-bit or longer keys will usually reveal (k, ri ) correctly with current standards for key blinding and a small public exponent; and – In these cases, an exhaustive search is computationally feasible to find the correct blinding factors (k, ri ). Incidentally, a powerful counter-measure in the case of Montgomery conditional subtractions is just to halve the modulus. This halves the number of conditional subtractions, and so halves the number of bits which are leaked. 6.2
Combining Traces to Determine k in Phase 1
The leakage from t traces can be processed for an outlay of t times the effort for one. If these traces are independent, t times as much bit information is extracted. Thus, a very small number of traces should result in k being determined with some confidence, since the same k is used in all cases. In fact, the correct value of k should have been guessed for all or almost all traces, and, if there is any bias, the correct value for k should be one of the most popular among the best guesses for an individual trace. Guesses at k are ranked as follows. For each sufficiently good guess k+ri E, the value of k = (k+ri E) mod E is extracted and the associated value of the
312
C.D. Walter
metric μ is added to the weighting of k. The higher the total weight for a guess at k, the more likely that value is to be correct. The possible values of k are then considered in descending order of weight in Phase 2, the heaviest first. Our simulation did not investigate how much this ranking reduces the search space in Phase 2 as a function of t; from the information theoretic point of view, it seems possible that k is almost completely determined by only a very small number of traces. This is an important detail that still needs to be researched as it affects the effectiveness of the blinding.
7
The Attack: Phase 2
Let Si be the set of plausible guesses at (k, ri ) for the ith trace, and suppose Si is partitioned into subsets Sik which share the same k. Armed with these sets, the adversary progresses to phase 2, which is the recovery of the remaining, least significant bits of φ(N ). He repeats this phase for each k separately, choosing the most likely k first. φ(N ) is constructed bit by bit from the most significant end. The first half of φ(N ) was obtained already from the public N and equation (1). Let φ(N )j = (φn−1 φn−2 ..φj )2 be the part of φ(N ) already determined, so φj−1 is the next bit to be guessed. Let Φj = (φj−1 φj−2 ..φj−w )2 be a guess at the next w bits of φ(N ). For each possible value of word Φj , the right side of equation (4) is evaluated with φ(N )j−w in place of φ(N ). This yields an approximation Dri ,j−w to Di in which only the most significant n−j+w bits are of interest. The same metric as in Phase 1 is used again to measure how well this matches the leakage from Di , namely μ(tr(Di ), ops(Dri ,j−w , j−w+ log2 R)). (As before, at the point before division by E, we ignore the lowest log2 RE bits containing a contribution from φj−w because they are too contaminated by the carries up from less significant bits of φ(N ).) For the given k, the sum 0003 0003 μ(tr(Di ), ops(Dri ,j−w , j−w+ log2 R)) (10) μw (k, j, Φj ) = i
ri ∈Sik
over all guesses is used to assess the worth of the choice for Φj . The leading bit of Φj from the maximum μw (k, j, Φj ) is selected as the value for φj−1 . Correct bit choices amplify any peak (i.e. maximum) values of the metric μw whilst incorrect choices decrease it. Moreover, previous mistakes reduce any peaks. When that happens, it is necessary to backtrack and select the most promising previous value. The difference between the two cases is determined using a threshold value for the metric which is obtained by experience. When it becomes too low for every value of Φj , it is necessary to backtrack and select the most promising previously untried value. The least significant bits of Φj are partly masked by carries up, and contribute less to the peak values than the more significant bits. So only the top one or two bits of the best Φj are chosen each time. In this way the bits φj are chosen from most to least significant. Once most bits have been guessed, the final log2 E bits are fully determined by the division being exact in equation (4).
Longer Randomly Blinded RSA Keys May Be Weaker Than Shorter Ones
313
Table 2. Probability of predicting the correct bit of φ(N ) from t correct guesses ri with w lookahead bits when n = 1024 and log 2 RE = 16 w t = 25 t = 50 t = 100 t = 250
7.1
1
2
3
4
6
8
0.613 0.642 0.673 0.706
0.767 0.819 0.846 0.863
0.833 0.896 0.922 0.930
0.868 0.939 0.954 0.971
0.914 0.973 0.981 0.991
0.930 0.989 0.994 0.995
Phase 2 Simulation
For the simulation it was assumed that the correct (k, ri ) had been chosen for each i, i.e. that k had been deduced correctly and for the ith trace only the correct ri had been selected. So Sik = 1 and Sik0002 = 0 if k 0003 = k. From the conclusions about Phase 1, this should usually be the case for long keys. As long as there is a reasonable probability of detecting the correct bit each time, all of φ(N ) can be determined. Typical probabilities can be seen in Table 2. There seems little to be gained from having more than 100 traces; more is achieved by having more lookahead bits. In fact, the probability of picking the wrong bit seems to fall exponentially as the number w of lookahead bits increases1 . From Table 3, w ≥ 8 allows a significant proportion of keys to be recovered if the k and the randoms ri have been guessed correctly. The figures are for an implementation of the algorithm without backtracking. When incorrect bits are predicted, the process does not recover and random bits are generated thereafter. With most bits being correct, backtracking is a cheaper alternative to solve this than increasing the number of lookahead bits. Assuming that Table 2 probabilities are constant over the length of the key and are independent of the key length, it is possible to compute the probability of successfully recovering the key: approximately pn/2 where p is the table entry and n the key length. Table 3 gives these probabilities as obtained from a simulation with 100 traces and w = 8. This corresponds to p = 0.9973. With 10 lookahead digits the simulation shows there is a 60% chance of recovering 2048-bit keys, and this corresponds to p = 0.999512. Lastly, with 50 traces but w varying dynamically between 8 and 16 as necessary, 2048-bit keys were recovered in 11% of cases. Since the values of k and ri from Phase 1 will be mostly correct for 2048-bit keys with log2 RE ≤ 32, – It is computationally feasible to recover a substantial number of 2048-bit keys using 50 traces, current standards for random blinding, typical small public exponents, and expected levels of weak side channel leakage. 1
The maximum values of μw (k, j, Φj ) were computed where Φj ranged over i) values with φj = 0 and ii) values with φj = 1. The difference between these was a good indicator of the reliability of the choice of φj . Increasing w just for the cases for which this difference was smallest led to a remarkable improvement in accuracy. Moreover, decreasing w for other cases led to a considerable computational saving. Many 2048-bit keys were recovered successfully using just 50 traces and varying w between 8 and 16.
314
C.D. Walter
Table 3. Probability of success in determining φ(N ) from t = 100, correct ri s and key length n with w = 8 lookahead bits, no back-tracking and log2 RE = 16
7.2
n
512
768 1024
1536
2048
prob
0.50
0.40
0.13
0.04
0.29
The Case of Some Incorrect Phase 1 Deductions
Now consider the case where not all pairs (k, ri ) are correct. If (k, ri ) is incorrect then the above process applied only to this pair (i.e. t = 1 and Sik = 1) would result in choosing the lower bits of φ(N ) to satisfy (4) with the incorrect values (k, ri ) and the correct Di . This makes the lower bits incorrect by a multiplication factor of (k 0003 +ri0003 E)/(k+ri E) where (k 0003 , ri0003 ) is the correct pair. Moreover, for these bit choices the metric retains the peak values associated with a correct choice. So, without the context of other traces, the error will remain undetected and the pair (k, ri ) cannot be removed from consideration. Thus, if the above process is performed with a set of pairs (k, ri ), some of which are correct and others incorrect, then the incorrect values predict random bits, while the correct ones predict the correct bits. This averages to a weaker prediction of the correct bits. However, the incorrect choices become more apparent as more correct bits are appended to φ(N ). Eventually this is noticed and those choices can be dropped to speed up the process. It is easy to choose threshold values for the metric – several standard deviations below the average, say – to guide this decision. So the Phase 2 process is applied to all the outputs of Phase 1 for a given k, i.e. every (k, ri ) ∈ Sik for every trace, and the sum of all the metric values is used to choose the lower bits of φ(N ). Clearly, however, the limiting factor in this phase is the ratio of correct to incorrect predictions (k, ri ). If this is too small it will not be possible to identify correct bits through peaks in the value of the metric. Table 1 shows that key length is a very strong contributor to this: longer keys improve the ratio, making recovery of φ(N ) much easier. 7.3
Comparison with Fouque
In this algorithm the bits of φ(N ) are determined in the reverse order from that used by Fouque et al. [3]. This has several advantages. It makes the transition between the known upper half of φ(N ) and unknown lower half seamless, it allows the metric easily to include the value of all previous decisions, and it allows the division by E to be done straightforwardly. The problems of carry influence in the multiplications of equation (4) is similar for both directions.
8
Complexity
The first phase has time complexity O(REt log(RE)) where t is the number of traces needed to complete the second phase, and depends on the level of leakage.
Longer Randomly Blinded RSA Keys May Be Weaker Than Shorter Ones
315
This complexity results from an exhaustive search over all possible (k, ri ). It was remarked that there was an information leakage which is proportional to the length of the traces. Therefore, recovering the log2 (RE) bits of each (k, ri ) only requires processing a part of the traces with length O(log(RE)), not the whole length. Space is not an issue in this phase as only one pair (k, ri ) need be considered at any one time. The pairs are treated independently and so the work can be completely parallelised. For standard choices of E = 216 +1, R = 232 and a similar level of leakage to the example, this is clearly computationally feasible. In the second phase the worst situation is that all RE guesses are considered for every trace at each bit selection, making a total time complexity O(REnt), which is at worst similar to the first phase. However, if only R0003 E 0003 guesses survive then the complexity is reduced to O(R0003 E 0003 nt). This assumes that metrics do not have to be recomputed over the whole length of the trace every time another bit is guessed; instead the incremental effect of the new bit is used to update the preceding value. This approach requires O(R0003 t) space as different values of k are processed sequentially. The second phase requires strong leakage or a high ratio of correct pairs (k, ri ) to have a chance of working. Therefore practical limitations on the number of traces that can be obtained guarantees that space will not be the overriding problem. Furthermore, the work can be parallelised without difficulty at least as far as distributing the effort for each k to different processors. This would reduce the time complexity by a factor of O(E).
9
Conclusion
The scope of the attack of Fouque et al. [3] has been extended to include imprecise leakage by introducing a practical metric which prioritises the selection of guesses at the random blinding factors and bits of φ(N ) for an RSA modulus N . Both attacks target the typical set-up for RSA decryption/signing in a smartcard with standard counter-measures which include exponent blinding. It was found that very weak, imprecise leaked data could be successfully manipulated to reduce the ambiguity in the blinding factors by a factor essentially proportional to the length of the keys, so that the blinding factors are fully determined when the key is long enough. For typical choices of public exponent and blinding parameters, and a leakage rate equivalent to only 1 bit per 32 bits of key per trace, the blinding factors can be recovered correctly for keys above about 2048 bits in length. Reconstruction of the unknown lower bits of φ(N ) requires most of the blinding factors to be recovered correctly and sufficiently many traces to be available. With a leakage rate of 1 bit per r key bits, 1.5r traces suffice to recover φ(N ) and hence factor N without any need for an expensive search. In a simulation, a sizeable proportion of 2048-bit keys were successfully recovered using leakage from only 50 traces (r=32). Thus the attack is certainly computationally feasible with only weak, imprecise leakage. So longer keys were found to be more vulnerable. The best counter-measure is to ensure that blinding increases with key length at least until it becomes
316
C.D. Walter
computationally infeasible to test every blinding value individually. The attack illustrates that the information theoretic level of leakage can be into converted successfully into the secret key even in the presence of a typical collection of standard counter-measures.
References 1. Portable Data Carrier including a Microprocessor, Patent 4211919, US Patent and Trademark Office (July 8, 1980) 2. Dhem, J.-F., Koeune, F., Leroux, P.-A., Mestr´e, P., Quisquater, J.-J., Willems, J.L.: A practical implementation of the Timing Attack. In: Schneier, B., Quisquater, J.-J. (eds.) CARDIS 1998. LNCS, vol. 1820, pp. 175–190. Springer, Heidelberg (2000) 3. Fouque, P.-A., Kunz-Jacques, S., Martinet, G., Muller, F., Valette, F.: Power Attack on Small RSA Public Exponent. In: Goubin, L., Matsui, M. (eds.) CHES 2006. LNCS, vol. 4249, pp. 339–353. Springer, Heidelberg (2006) 4. Knuth, D.E.: The Art of Computer Programming, 3rd edn. Seminumerical Algorithms, vol. 2. Addison-Wesley, Reading (1997) 5. Ko¸c, C ¸ .K.: High Radix and Bit Recoding Techniques for Modular Exponentiation. International J. of Computer Mathematics 40(3-4), 139–156 (1991) 6. Kocher, P.: Timing attack on implementations of Diffie-Hellman, RSA, DSS, and other systems. In: Koblitz, N. (ed.) CRYPTO 1996. LNCS, vol. 1109, pp. 104–113. Springer, Heidelberg (1996) 7. Kocher, P., Jaffe, J., Jun, B.: Differential Power Analysis. In: Wiener, M.J. (ed.) CRYPTO 1999. LNCS, vol. 1666, pp. 388–397. Springer, Heidelberg (1999) 8. Montgomery, P.L.: Modular Multiplication without Trial Division. Mathematics of Computation 44(170), 519–521 (1985) 9. Schindler, W.: A Combined Timing and Power Attack. In: Naccache, D., Paillier, P. (eds.) PKC 2002. LNCS, vol. 2274, pp. 263–279. Springer, Heidelberg (2002) 10. Schindler, W.: On the Optimization of Side-Channel Attacks by Advanced Stochastic Methods. In: Vaudenay, S. (ed.) PKC 2005. LNCS, vol. 3386, pp. 85–103. Springer, Heidelberg (2005) 11. Schindler, W., Walter, C.D.: More detail for a Combined Timing and Power Attack against Implementations of RSA. In: Paterson, K.G. (ed.) Cryptography and Coding. LNCS, vol. 2898, pp. 245–263. Springer, Heidelberg (2003) 12. Walter, C.D.: Precise Bounds for Montgomery Modular Multiplication and Some Potentially Insecure RSA Moduli. In: Preneel, B. (ed.) CT-RSA 2002. LNCS, vol. 2271, pp. 30–39. Springer, Heidelberg (2002) 13. Walter, C.D., Thompson, S.: Distinguishing Exponent Digits by Observing Modular Subtractions. In: Naccache, D. (ed.) CT-RSA 2001. LNCS, vol. 2020, pp. 192–207. Springer, Heidelberg (2001)
Differential Power Analysis of HMAC Based on SHA-2, and Countermeasures Robert McEvoy, Michael Tunstall, Colin C. Murphy, and William P. Marnane Department of Electrical & Electronic Engineering, University College Cork, Ireland {robertmce,miket,cmurphy,liam}@eleceng.ucc.ie
Abstract. The HMAC algorithm is widely used to provide authentication and message integrity to digital communications. However, if the HMAC algorithm is implemented in embedded hardware, it is vulnerable to side-channel attacks. In this paper, we describe a DPA attack strategy for the HMAC algorithm, based on the SHA-2 hash function family. Using an implementation on a commercial FPGA board, we show that such attacks are practical in reality. In addition, we present a masked implementation of the algorithm, which is designed to counteract first-order DPA attacks.
1
Introduction
In today’s modern society of e-mail, internet banking, online shopping and other sensitive digital communications, cryptography has become a vital tool for ensuring the privacy of data transfers. To this end, Message Authentication Code (MAC) algorithms are used to verify the identity of the sender and receiver, and to ensure the integrity of the transmitted message. These algorithms process the message to be authenticated along with a secret key, which is shared between the sender and receiver. The result is a short string of bits, called a MAC. HMAC [1] is a popular type of MAC algorithm which is used in the IPsec [14] and TLS protocols [6], and is based on a cryptographic hash function such as SHA-2 [16]. The last decade has also seen the emergence of attacks which target cryptographic algorithms that are implemented in embedded hardware [9]. Of particular interest are differential side-channel attacks, such as Differential Power Analysis (DPA) [12]. These non-invasive attacks exploit information that leaks from a cryptographic device via some side channel, such as timing information, power consumption, or electromagnetic emanations. Comparing small variations in the side-channel information as a device processes different messages can potentially allow an attacker to recover secret information stored within the device. In this paper, we examine the susceptibility to differential side-channel attacks of the HMAC algorithm based on the SHA-2 family of hash functions. Side-channel attacks on hash functions and the HMAC algorithm have been discussed in the past, but specific attack details for the SHA-2 family have not been given, nor have countermeasures been designed. In 2001, Steinwandt et al. [21] presented a theoretical attack on the SFLASH signature scheme, which S. Kim, M. Yung, and H.-W. Lee (Eds.): WISA 2007, LNCS 4867, pp. 317–332, 2007. © Springer-Verlag Berlin Heidelberg 2007
318
R. McEvoy et al.
targeted an exclusive-OR (XOR) operation in SHA-1. Coron and Tchoulkine [5] noted the vulnerability of the HMAC algorithm to a DPA attack. Lemke et al. [10] described a theoretical DPA attack on the HMAC algorithm based on the hash function RIPEMD-160, noting that a similar approach could be taken for a HMAC scheme based on SHA-1. Okeya et al. [18,19] highlight the susceptibility of MAC and HMAC algorithms to side-channel attacks, but the exposition is for the HMAC algorithm based on block-cipher based hash functions, in contrast with SHA-2, which is a dedicated cryptographic hash function. In this paper, we characterise a differential side-channel attack on an implementation of the HMAC algorithm that uses the SHA-2 hash function family. Furthermore, we provide attack results on a FPGA implementation of the algorithm. We also describe countermeasures that could be used to prevent such side-channel attacks, by designing masked circuits for the vulnerable SHA-2 operations. The rest of this paper is organised as follows. In Section 2, the necessary background theory regarding the HMAC algorithm, the SHA-2 family, and DPA attacks is introduced. Section 3 gives a detailed account of how the SHA-256 based HMAC scheme can be broken by a side-channel attacker. Results from a practical FPGA-based implementation of this attack are presented in Section 4. In Section 5, a masking scheme is designed as a countermeasure against the attack, and the resulting FPGA-based scheme is tested in Section 6. Section 7 concludes the paper.
2 2.1
Background Theory HMAC Algorithm Overview
The HMAC authentication scheme was first introduced by Bellare et al. at CRYPTO’96 [1]. The scheme was designed such that the security of the MAC is built upon the security of the underlying hash function h. The MAC is calculated as follows: (1) HMACk (m) = h((k ⊕ opad) h((k ⊕ ipad) m)) where k is the secret key (padded with zeros to equal the block size of h), and m is the message to be authenticated. ipad is a fixed string whose length equals the block size of h; generated by repeating the hexadecimal byte 0x36. Similarly, opad is fixed and is formed by repeating the hexadecimal byte 0x5C. ⊕ and denote XOR and concatenation respectively. Clearly, in order to calculate HMACk (m), the hash function h must be invoked twice. In this paper, we focus on the first call to the hash function, which calculates the partial MAC: HMAC0002k (m) = h((k ⊕ ipad) m)
(2)
In [1], the authors suggested using MD5 or SHA-1 to instantiate the hash function h. In 2002, the HMAC algorithm was released as a standard by NIST [17], in which h is defined as a NIST-approved hash function. In this paper, we adhere to this standard and choose SHA-256 to instantiate h. This follows a recent
Differential Power Analysis of HMAC
319
trend in the cryptographic community away from older hash functions, for which weaknesses have been identified [11], and towards newer constructions like the SHA-2 family [16]. We use the term “HMAC-SHA-256” to denote the HMAC algorithm that uses SHA-256 to instantiate h. 2.2
SHA-256 Description
There are four hash functions in the SHA-2 family: SHA-224, SHA-256, SHA384 and SHA-512. Each algorithm generates a fixed-length hash value; SHA-224 produces a 224-bit output, SHA-256 has a 256-bit output, etc. The compression functions in SHA-224 and SHA-256 are based on 32-bit operations, whereas the compression functions for SHA-384 and SHA-512 are based on 64-bit operations. We focus on SHA-256 in our attacks, because it is easier in practice to perform a side-channel attack on a 32-bit word than on a 64-bit word. However, in theory, the side-channel attacks and countermeasures described in this paper should also be applicable to HMAC-SHA-384 and HMAC-SHA-512. The SHA-256 algorithm essentially consists of three stages: (i) message padding and parsing; (ii) expansion; and (iii) compression. Message Padding and Parsing. The binary message to be processed is appended with a ‘1’ and padded with zeros until its bit length ≡ 448 mod 512. The original message length is then appended as a 64-bit binary number. The resultant padded message is parsed into N 512-bit blocks, denoted M (i) , for 1 ≤ i ≤ N . These M (i) message blocks are passed individually to the message expansion stage. Message Expansion. The functions in the SHA-256 algorithm operate on 32-bit words, so each 512-bit M (i) block from the padding stage is viewed as (i) sixteen 32-bit blocks denoted Mt , 1 ≤ t ≤ 16. The message expansion stage (also called the message scheduling stage) takes each M (i) and expands it into sixty-four 32-bit Wt blocks for 1 ≤ t ≤ 64, according to equations given in [16]. Message Compression. The Wt words from the message expansion stage are then passed to the SHA compression function, or the ‘SHA core’. The core utilises eight 32-bit working variables labelled A, B, . . . , H, which are initialised (0) (0) to predefined values H0 –H7 (given in [16]) at the start of each call to the hash function. Sixty-four iterations of the compression function are then performed, given by: A = T1 0002 T2 (3) E = D 0002 T1 (7) B=A (4) F =E (8) C=B D=C where T1 = H 0002
(5) (6) 0002 1
G=F H =G
(E) 0002 Ch(E, F, G) 0002 Kt 0002 Wt
(9) (10)
(11)
320
R. McEvoy et al.
T2 =
0002 0
(A) 0002 M aj(A, B, C)
(12)
Ch(x, y, z) = (x ∧ y) ⊕ (¯ x ∧ z) M aj(x, y, z) = (x ∧ y) ⊕ (x ∧ z) ⊕ (y ∧ z) 0002 (x) = ROT2 (x) ⊕ ROT13 (x) ⊕ ROT22 (x) 0 0002 (x) = ROT6 (x) ⊕ ROT11 (x) ⊕ ROT25 (x)
(13) (14) (15) (16)
1
and the inputs denoted Kt are 64 32-bit constants, specified in [16]. All additions in the SHA-256 algorithm are computed modulo 232 , denoted by 0002. The logical AND operator is denoted by ∧, and x¯ denotes the logical NOT operator. After 64 iterations of the compression function, a 256-bit intermediate hash value H (i) , (i) (i) comprising H0 –H7 , is calculated: (i)
(i−1)
H0 = A 0002 H 0
(i)
(i−1)
, H1 = B 0002 H1
(i)
(i−1)
, . . . , H7 = H 0002 H7
(17)
The SHA-256 compression algorithm then repeats and begins processing another 512-bit block from the message padder. After all N data blocks have been processed, the output, H (N ) , is formed by concatenating the final hash values: (N )
H (N ) = H0 2.3
(N )
H1
(N )
H2
(N )
. . . H7
(18)
Differential Side-Channel Analysis
Some of the most common forms of side-channel analysis are Differential Power Analysis (DPA) [9] and related attacks such as Correlation Power Analysis (CPA) [2]. In this type of attack, a series of power consumption traces are acquired using an oscilloscope, where each trace has a known associated input (e.g. the message block being processed). A comprehensive guide to this class of attacks is provided in [12]. The fundamental principle behind all DPA attacks is that at some point in an algorithm’s execution, a function f exists that combines a fixed secret value with a variable which an attacker knows. An attacker can form hypotheses about the fixed secret value, and compute the corresponding output values of f by using an appropriate leakage model, such as the Hamming Distance model [2]. The attacker can then use the acquired power consumption traces to verify her hypotheses, by partitioning the acquisitions or using Pearson’s correlation coefficient. These side-channel analysis attacks are aided by knowledge of details of the implementation under attack. Moreover, these attacks can be used to validate hypotheses about implementation details. In subsequent sections, these side-channel analysis attacks are referred to as DPA attacks.
3
Attacking HMAC-SHA-256
In this section, we describe an attack on HMAC-SHA-256 using DPA. This attack does not allow recovery of the secret key itself, but rather a secret intermediate
Differential Power Analysis of HMAC
321
state of the SHA-256 hash function. Knowledge of this intermediate state would allow an attacker to forge MACs for arbitrary messages. We note that the attack is not limited to DPA, and other side-channels, such as the electromagnetic sidechannel, could also be used. 3.1
Goal of the Attack
We assume that the attacker has access to a device that performs the HMAC algorithm, and that she has knowledge of the messages being processed by the device. This target device contains a basic implementation of the SHA-256 algorithm, and does not include any side-channel analysis countermeasures. Furthermore, we assume that the attacker has access to some side-channel information (e.g. the power consumption) while the device is calculating the MAC, which leaks the Hamming Distance between the internal signals as they change from one state to the next. As is common, we assume that the secret key is stored in a secure environment, which does not leak side-channel information. The attack focuses on the first execution of SHA-256, given in equation (2). The block size of SHA-256 is 512 bits; therefore, using the notation from Section 2.2, k = ipad = 512. Without loss of generality, we can assume that the size of the message m is such that N = 2, i.e. the device will run through the 64 iterations of the compression function twice, in order to calculate equation (2). When i = 1, the hash function is operating on (k ⊕ ipad), which clearly does not change from one execution of the HMAC algorithm to the next. Hence, the intermediate hash H (1) is also fixed and unknown. Recall that in order to perform a differential side-channel attack, we require fixed unknown data to be combined with variable known data. This criterion is fulfilled during the calculation of H (2) , when the variable m is introduced and combined with the previous intermediate hash H (1) . Therefore, in theory, a differential side-channel attack could be launched on a device calculating equation (2), in order to recover H (1) . This knowledge would allow the attacker to create partial MACs of her choice. Reapplying the side-channel attack on the second invocation of SHA256 in the HMAC algorithm would allow the attacker to forge full MACs for messages of her choosing. Consequently, the goal of the attacker is to recover the secret intermediate hash value H (1) . 3.2
Attack Strategy
The secret intermediate hash H (1) manifests itself as the initial values of the eight 32-bit working variables A–H, when i = 2. We use the subscript t, 1 ≤ t ≤ 64 to denote the round number, e.g. A1 refers to the value of A at the start of round 1 of the compression function, etc. The side-channel attacker’s goal is to uncover the eight variables A1 –H1 . A strategy for such an attack is now described. 1. With reference to equations (3) and (7), it is clear that at some point in the first round, the variable T 11 must be calculated. T 1t is a large sum with 5 terms, and can be re-written as: T 1t = θt 0002 Wt
(19)
322
R. McEvoy et al.
where θ t = Ht 0002
0002 1
(Et ) 0002 Ch(Et , Ft , Gt ) 0002 Kt
(20)
In round 1, θ1 is fixed and unknown, and W1 is known by the attacker, since it is related to m. Therefore, a DPA attack can be launched by making hypotheses about θ1 , and computing the corresponding values of T 11. Since SHA-256 uses 32-bit words, 232 hypotheses for θ1 are required. Furthermore, since we assume that the target device leaks the Hamming Distance (HD), the 232 possibilities for the previous state, T 10, must also be taken into account. Therefore, the attacker correlates the power traces with her 264 hypotheses for HD(T 10 , T 11). This allows the attacker to recover T 10 and θ1 , and then calculate T 11 for any message m. Clearly, correlating with 264 hypotheses would be computationally infeasible, even for well-resourced attackers. In Section 3.3, we describe how the Partial CPA technique [22] can be used to significantly reduce the attack’s complexity. 2. The above attack stage gives the attacker control over the value of T 11 , so it is now a known variable. Using equation (7), the attacker can now make hypotheses on the (fixed) bits of D1 , using the bits of E2 as selection bits. Using the Hamming Distance model, hypotheses for the previous (secret) state E1 are also generated. In this way, the attacker can recover her first secrets, D1 and E1 , and accurately predict the value of E2 for any message m. 3. Focusing on equation (3), we observe that T 11 is variable and known, whereas T 21 is fixed and unknown. The attacker can launch a DPA attack on A2 by forming hypotheses about T 21 and the previous state A1 . Hence, the secret value of A1 is revealed. Furthermore, with knowledge of both T 11 and T 21 , the attacker can now accurately predict A2 for any message m. Therefore, by analysing the side-channel signals from the first round, the attacker can recover the fixed secret values of θ1 , D1 , E1 , T 21 and A1 , and also predict the values of variables T 11 , A2 and E2 . 4. The attacker now turns her attention to the second SHA-256 round. Here, the Ch function is calculated as: Ch(E2 , F2 , G2 ) = (E2 ∧ F2 ) ⊕ (E2 ∧ G2 )
(21)
where E2 is variable, and known by the attacker. From equations (8) and (9), we observe that F2 and G2 are fixed at E1 and F1 , respectively. Therefore, the attacker can generate hypotheses about the unknown values F1 , and attack the output of the Ch function. Of course, 232 hypotheses for the previous state Ch(E1 , F1 , G1 ) are also required. Recovering F1 means that the attacker can now accurately predict the variable Ch(E2 , F2 , G2 ). 5. The next point of attack is the calculation of T 12 (equation (11)). At this stage, the only fixed unknown value in the equation is H2 , as every other variable can be predicted. The attacker already knows the previous state T 11 from stage 1 above. Mounting a DPA attack uncovers H2 , and allows
Differential Power Analysis of HMAC
323
T 12 to be predicted. From equation (10), it can be seen that H2 is equivalent to G1 . 6. The knowledge of T 12 gained from the previous attack stage can be applied to equation (7). Using the bits of E3 as the selection function, the attacker can mount a DPA attack that uncovers D2 . From equation (6), we observe that D2 is equivalent to C1 . 7. The M aj function in the second round can be expressed as: M aj(A2 , B2 , C2 ) = (A2 ∧ B2 ) ⊕ (A2 ∧ C2 ) ⊕ (B2 ∧ C2 )
(22)
where A2 is variable, and known by the attacker. From equations (4) and (5), we observe that B2 and C2 are fixed at A1 and B1 , respectively. Using a similar approach to that taken in stage 4 above, the attacker can perform DPA on M aj and discover the secret value of B1 . 8. By following the above strategy, the attacker can recover the fixed secrets A1 –G1 . The last remaining secret variable, H1 , can be found by reverting the focus to round 1, and substituting into equation (11), where the only unknown value is that of H1 . The eight 32-bit secret values are thus recovered, using seven first-order DPA attacks. 3.3
Complexity of the Attack
As noted above, it is currently computationally infeasible for an attacker to compute 264 hypotheses for a DPA attack. However, as indicated in [2] and illustrated in [22], a partial correlation may be computed, rather than the full correlation. If a correlation coefficient of ρ is obtained by correctly predicting all 32 bits of an 0003 intermediate variable, then we would expect to obtain a partial correlation of ρ l/32 by predicting l bits correctly. Hypotheses can be made on smaller sets of bits at a time, e.g. l = 8, and this strategy can be employed to keep only those hypotheses that produce the highest partial correlations. In this way, the full 32-bit correlation can be built up in stages, thereby reducing the complexity of the attack. This is similar to the ‘extend-and-prune’ strategy employed by a template attack [3].
4 4.1
Attack on FPGA Implementation Implementation Details
In order to demonstrate the feasibility of a DPA attack on HMAC-SHA-256, we implemented the algorithm on a Xilinx FPGA Board. FPGAs are attractive for implementing cryptographic algorithms because of their low cost (relative to ASICs), and their flexibility when adopting security protocol upgrades. FPGAs also allow rapid prototyping of various designs. For our experiments, we implemented SHA-256 on Xilinx’s low-cost Spartan™-3E Development Kit, which contains the Spartan™XC3S500E FPGA [23]. FPGAs consist mostly of Configurable Logic Blocks (CLBs), arranged in a regular array across the chip. In our
324
R. McEvoy et al. 0.3
8 bits 12 bits 16 bits 20 bits 24 bits 28 bits 32 bits
0.25
0.2
Correlation
0.15
0.1
0.05
0
−0.05
−0.1
0
0.5
1
1.5
Time
2
2.5
3 4
x 10
Fig. 1. Correlation and partial correlations between the power consumption and E2 , given the correct prediction of D1 and the previous state E1
case, each CLB contains 4 logic “slices”, which is the basic unit used to quantify the area occupied by a design. Each slice contains two four-input Look-Up Tables (LUTs) and two registers. Logic also exists within a slice which allows fast implementation of carry look-ahead addition. Each slice has a dedicated multiplexer, which is hard-wired to provide a fast carry chain between consecutive slices and CLBs. Indeed, fast carry logic is a feature of many modern FPGAs. We will make use of this dedicated addition circuitry in Section 5. Several optimisations for the SHA-2 family exist, such as pipelining and loop unrolling [4]. However, for simplicity, it was decided to implement a basic design without such optimisations. The design was captured using VHDL, and synthesis, placing and routing were performed using Xilinx ISE™v9.1i. The processor and interface circuitry utilise 951 slices, corresponding to 20% of the FPGA resources. The critical path in the design (i.e. the longest combinational path) is 16.4 ns, during the calculation of A (equation (3)). Block RAMs (BRAMs) on the FPGA are used to store the various messages to be processed; this reduces the communication requirements with the FPGA board. 4.2
Experimental Results
In order to obtain DPA power traces from the design, the FPGA board was configured with the basic SHA-256 design, and a 10 Ω resistor was inserted in the FPGA power supply line. Using a LeCroy WaveRunner 104Xi oscilloscope and a differential probe, we could measure the voltage fluctuations across the resistor as the SHA-256 algorithm was being executed. Therefore, the traces recorded on the oscilloscope were proportional to the power consumption of the FPGA during the execution of the algorithm. Traces for the first three rounds were captured while 4000 random messages were being processed by the FPGA. In order to reduce acquisition noise, each captured trace corresponded to the average of 700 executions with the same message. Figure 1 shows the correlation trace achieved when D1 and the unknown previous state E1 are correctly predicted.
Differential Power Analysis of HMAC
325
The different levels correspond to the correlation coefficients achieved when a certain number of bits are correctly predicted.
5
Masking the SHA-256 Algorithm
The preceding sections have demonstrated the susceptibility of hardware implementations of the HMAC algorithm to first-order DPA attacks. We now examine how to use masking as a countermeasure to such attacks. The masking technique aims to use random values to conceal intermediate variables in the implementation of the algorithm, thereby making the side-channel leakage independent of the secret intermediate variables. Much of the literature has focused on masking techniques for software implementations of cryptographic algorithms (e.g. [5,8,15]). In [7], Goli´c detailed techniques for masking hardware implementations, which we build upon below in order to mask HMAC-SHA-256. 5.1
Requirements
Consider a function f and intermediate variables x, y and z, such that z = f (x, y). If x or y are key-dependent ors personalized experience, the evaluation of recommendation credibility is depending on the common set of peers that have interacted with requestor and the recommendatory peers. As the increase of peers’ quantity, the common set is always very small [14]. Eigentrust [3] considers the recommendation credibility as being equal to the service trust. This metric is not suitable in circumstances where a peer may maintain a good reputation by providing high quality services but send malicious feedbacks to its competitors. Research [21] presents a new recommendation credibility calculation model, but there exists unfairness to blameless peers. Research [6] proposes the weighted majority algorithm (WMA), the main idea is to assign and tune the weights so that the relative weight assigned to the successful advisors is increased and the relative weight assigned to the unsuccessful advisors is decreased. But, the approaches mentioned above don’t consider more complex malicious strategies, for example, peers could try to gain trust from others by telling the truth over a sustained period of time and only then start lying, colluding peers could inflate reputation using unfair ratings flooding. Moreover, a peer may be penalized for an incorrect opinion that was based on a small number of interactions and/or a large variation in experience. Then, honest peers will be falsely classified as lying. Truthtelling Mechanisms. In order to stimulate reputation information sharing and honest recommendation elicitation, Jurca and Faltings [8] propose an incentive compatible reputation mechanism to deal with inactivity and lies. A client buys a recommendation about a service provider from a special broker named R-nodes. After interacting with the provider, the client can sell its feedback to the same R-node, but gets paid only if its report coincides with the next client’s report about the same service provider. One issue is that if the recommendation from an R-node is negative such that a client decides to avoid the service provider, the client will not have any feedback to sell. Or in the existence of opportunistic service providers that, for example, behave and misbehave alternatively, an honest feedback does not ensure payback. This opens up the possibility of an honest entity to have negative revenue and thus is unable to buy any recommendation. Besides, the effectiveness of their work depends largely on the integrity of R-nodes, which is assumed to be trusted a priori. To encourage the exchange of reputation information, Pinocchio [16] rewards participants that advertise their experience to others and uses a probabilistic honesty metric to detect dishonest users and deprive them of the rewards. The reward can be used to query the reputation of others. Pinocchio does not intend to protect against conspiracies or bad-mouthing. Research [17] proposes a mechanism for providing the incentives for reporting truthful feedback in a P2P system for exchanging services. Under their approach, both transacting peers submit ratings on performance of their mutual transaction. If these are in disagreement, then both transacting peers are punished, since such an occasion is a sign that one of them is lying. The severity of
374
J. Chang et al.
each peer’s punishment is determined by his corresponding non-credibility metric; this is maintained by the mechanism and evolves according to the peer’s record. But their proposal still avoids the fundamental problem that peers have no incentive to provide reputation feedback. Even peer can provide feedback, but, obviously, malicious peers may collude to weaken the mechanism (two colluding peers can provide the consistent rating for each other to increase their reputation value.). In contrast to these works, we propose a fully distributed mechanism based on local knowledge that provides malicious and non-participating entities an incentive for participation and honest behavior.
3 Incentive-Compatible Reputation Mechanism We adopt the terminology witness to denote a peer solicited for providing its recommendation. Finding the right set of witnesses is a challenging problem since the reputation value depends on their recommendations. Our approach for collecting recommendations follows the solution proposed by Yu et al [6], in which recommendations are collected by constructing chains of referrals through which peers help one another to find witnesses. In order to stimulate peers to send sufficiently honest recommendations, we make some changes (see Sect. 3.3 for details). In this section, we first introduce our trust valuation algorithm used by a peer to reason about trustworthiness of other peers based on the available local information which includes past interactions and recommendations received from witnesses. Then, we present our recommendation credibility calculation model, which can effectively detect dishonest recommendations in a variety of scenarios where more complex malicious strategies are introduced. Last, a simple yet effective trust information exchange protocol is proposed to stimulate sufficiently honest recommendations. 3.1 Trust Evaluation Algorithm In respect that peers may change their behaviors over time, and the earlier ratings may have little impact on the calculation result, it is desirable to consider more recent ratings, that is, to consider only the most recent ratings, and discard those previous ones. Such a restriction is motivated by game theoretic results and empirical studies on ebay that show that only recent ratings are meaningful [15]. Thus, in the following, only interactions that occur within a sliding window of width D are considered. Moreover, by doing so, the storage costs of our reputation system are reasonable and justified given its significant benefits. There are two kinds of trust relationships among peers, namely, direct trust and indirect trust [19]. The direct trust of peer i to peer j can be evaluated from the direct transaction feedback information between i and j. The indirect trust of i to j can be computed according to the recommendations of peers who have interacted with j. The overall trust of peer i to peer j is produced by combining these two trust value. Direct Trust. Any peer in our system will maintain a direct trust table for all the other peers it has interactions with directly. Suppose peer i has some interactions with peer j during the last D time units, the entry for peer j is denoted as
ICRep: An Incentive Compatible Reputation Mechanism for P2P Systems
375
E xp ( i , j ) = n , E X P _ L IS T , where n is the number of ratings, and E X P _ L IS T is an index in which these ratings are kept. The rating is in the form of r = (i, j, t, v). Here i and j are the peers that participated in interaction, and v is the rating peer i gave peer j. The range of v is [0, 1], where 0 and 1 means absolutely negative, absolutely positive respectively. t is the time stamp of the interaction. A rating is deleted from the direct trust table after an expiry period D. From the direct trust table, the direct trust valuation of peer i to peer j at time t is represented as < D ijt , ρ ijt >, where D ijt is the direct trust value and ρ ijt is introduced
to express the level of confidence about this direct trust value. Although there are a lot of elements that can be taken into account to calculate the level of confidence, we will focus on two of them: the number of experiences used to calculate the direct trust value and the variability of its rating values. D ijt is calculated by the following formula: t ij
D
∑ = ∑
e∈ Exp ( i , j ). EXP _ LIST
e.v ∗ α ( t − e.t )
α ( t − e .t ) e∈Exp ( i , j ). EXP _ LIST
(1)
Where e.v is the value of the rating e, and α is the decay factor in the range of (0, 1). A malicious node may strategically alter its behavior in a way that benefits itself such as starting to behave maliciously after it attains a high reputation. In order to cope with strategic altering behavior, the effect of an interaction on trust calculation must fade as new interactions happen [10]. This makes a peer to behave consistently. So, a peer with large number of good interactions can not disguise failures in future interactions for a long period of time. Let CIN ijt be the level of confidence based on the number of ratings that have been taken into account in computing D ijt . As the number of these ratings ( E x p ( i , j ). n ) grows, the level of confidence increases until it reaches a defined threshold (denoted by m). ⎧ E xp ( i , j ).n ⎪ C IN ijt = ⎨ m ⎪⎩1
if E xp ( i , j ).n ≤ m
(2)
otherw ise
Hence, the level of confidence CIN ijt increases from 0 to 1 when the number of ratings E x p ( i , j ) . n increases from 0 to m, and stays at 1 when E x p ( i , j ) . n exceeds m. Let CID ijt be the level of confidence based on the variability of the rating values.
CIDijt is calculated as the deviation in the ratings’ values: C ID ijt = 1 −
1 ∗ 2
∑
e∈ Exp ( i , j ). EX P _ LIST
α ( t − e .t ) ∗ e .v − D ijt
∑ e∈ Exp ( i , j ). E XP _ L IST α (t − e .t )
(3)
376
J. Chang et al.
This value goes from 0 to 1. A deviation value near 0 indicates a high variability in the rating values (this is, a low confidence on the direct trust value) while a value close to 1 indicates a low variability (this is, a high confidence on the direct trust value). Finally, the level of confidence
ρijt
about the direct trust value
Dijt combines the
two reliability measures above:
ρijt = CINijt ∗ CIDijt
(4)
Indirect trust. After collecting all recommendations about peer j using the rating discovery algorithm proposed by Yu et al [6], peer i can compute the indirect trust about peer j. Let Ti = {p1,p2,…,pti} be the set of trustworthy peers which reply the request. If p k ∈ T i had at least one service interaction with pj, it replies recommendation < Re c kjt , CI kjt > based on the local rating records with peer j. For an honest peer p, we have R e c kt j = D kt j and CI kjt = ρ kjt . The inclusion of level of confidence in the recommendation sent to the recommendation requestor allows the recommendation requestor to gauge the how much confidence the peer itself places in the recommendation it has sent. To minimize the negative influence of unreliable information, the recipient of these recommendations weighs them using this attached level of confidence and the credibility of the sender. If the level of confidence has a small value, the recommendation is considered weak and has less effect on the reputation calculation. Credibility is evaluated according to the past behavior of peers and reflects the confidence a peer has in the received recommendation. Credibility computation is presented in the next subsection. The indirect trust value of peer j according peer i, denoted by
Rijt , is given by the
following formula: t ij
R Where
∑ =
pk ∈Ti
∑
Crikt ∗ CI kjt ∗ Re ckjt
(5)
Crikt ∗ CI kjt p ∈T k
i
Crikt is the credibility of peer k according to peer i, CI kjt denotes the level of
confidence about the recommendation value Re cijt . So peer i gives more weight to recommendations that are considered to be of a high confidence and that come from peers who are more credible. Overall Trust. Base on the direct trust value and indirect trust value calculated t
above, the overall trust value of peer i to peer j’s service (denoted by Oij ) is defined in formula (6).
Oijt = λ ∗ Dijt + (1 − λ ) ∗ Rijt
(6)
ICRep: An Incentive Compatible Reputation Mechanism for P2P Systems
377
Where D ijt denotes the direct trust value of i to j, R ijt is the indirect trust of peer j according to peer i, the “self-confidence factor” is denoted by λ , which means that how a peer is confident to its evaluation of direct trust value. λ = Exp (i, j ).n / m ,
Exp (i, j ).n is the number of the direct interactions considered, and m is the maximum number to be considered for a peer, and the upper limit for
λ
is 1.
3.2 Recommendation Credibility Any peer in our system will also maintain a recommendation credibility table for all the other peers it has got recommendations from. Suppose peer i has got some recommendations with peer k during the last D time units, the entry for peer k is denoted as R E C ( i , k ) = n , R E C _ L I S T , where n is the number of credibility ratings, and REC _ LIST is an index in which these credibility ratings are kept. In more detail, after having an interaction with peer j, peer i gives its rating about j’s service performance as V ij . Now, if peer i received recommendation < V k j , C I kj > from peer k, then the credibility rating value V w for peer k about this recommendation is given in the following formula:
Vw = 1 − Vkj − Vij The credibility rating value
(7)
Vw is set to be inversely proportional to the difference
between a witness recommendation value and the actual performance (e.g. higher difference, lower credibility). The rating about peer k’s credibility — r = (i, k, t, V w , C I w ) — is then appended to peer i’s recommendation credibility table. t is the time of peer k providing peer i the recommendations about peer j, C I w = C I kj . The recommendation credibility of peer i to peer j at time t is denoted by C rijt , it is a [0, 1]-valued function which represents the confidence formed by peer i about the truthfulness of j's recommendations. This function is local and is evaluated on the recent past behavior of both peer i and j. It is locally used to prevent a false credibility from being propagated within the network. The credibility trust value
Crikt is
calculated as follows: ⎧∑e∈Rec(i,k).REC _ LIST e.Vw ∗αt−e.t ∗eCI . w ⎪ if Rec(i, k).REC _ LIST ≠∅ t t −e.ts Crik = ⎨ ∑e∈Rec(i,k ).REC_ LIST α ∗eCI . w ⎪ otherwise ⎩c0
(8)
Where α is the decay factor in the range of (0, 1) using equation (1), So, it can cope with more complex malicious strategies, for example, peers could try to gain trust from others by telling the truth over a sustained period of time and only then start lying. e .C I w is the level of confidence about the recommendation value, and
378
e .V
J. Chang et al.
w
is the value of the credibility rating e. If no such ratings has been recorded, we
will assign the default credibility trust value, denoted by
c0 , to peer j.
Our credibility model considers the level of confidence about the recommendation value. Giving incorrect recommendation can decrease the recommendation credibility of a peer. So, a peer can lower the level of confidence for opinions about which it is not very sure, therefore risking less loss of credibility in case its judgment is incorrect. If a weak recommendation is inaccurate, the recommendation credibility does not diminish quickly. A peer can not be penalized as much for an incorrect opinion that was based on a small number of interactions and/or a large variation in experience. Then, honest peers will not be falsely classified as lying. 3.3 Simple Trust Information Exchange Protocol The efficiency of the reputation mechanism fully depends on the number of received recommendations and the quality of each of them. In our reputation mechanism, incentive for participation and honest behavior is implemented through a fair differential service mechanism. The goal of service differentiation is not to provide hard guarantees but to create a distinction among the peers based on their contributions to the system. The basic idea is, the more the contribution, the better the relative service. In order to achieve this goal, first, we define two parameters that can be used to create service differentiation in trust information exchange, namely, level of participation, measuring if a peer is active in providing recommendations, and the recommendation credibility defined in section 3.2, assessing if a peer is providing honest recommendations. Second, based on the above two parameters, we propose a simple yet effective trust information exchange protocol using a “tit-for-tat” strategy, to elicit sufficient and honest participation. Based on the rating discovery algorithm proposed in [6], our protocol makes some changes to implement fair service differentiation. We introduce the level of participation notion as the propensity of a peer for replying to a rating request. It is described by function
lijt such that lijt represents the
percentage of times j provided its recommendation to i's queries regarding other peers over the last D time units, with
lij0 = 1. We use a simple approach to calculate lijt ,
which is calculated based on the number of recommendations provided by peer j to i during the last D time units. As the number of these recommendations (retrieved from the recommendation credibility table and denoted by
I ijt ) grows, the participation
level increases until it reaches a defined threshold (denoted by ⎧ I ijt ⎪ l ijt = ⎨ I m a x ⎪1 ⎩
if
I ijt ≤ I m a x
I m ax ). (9)
e lse
Finally, to elicit sufficient and honest participation, we apply the tit-for-tat strategy during the collect phase, i.e., upon receipt of a rating request from peer j, with
ICRep: An Incentive Compatible Reputation Mechanism for P2P Systems t
379
t
probability min( lij , Crij ) peer i provides its recommendation, otherwise it ignores the request. The more details are described in algorithm 1. Consequently, by not participating, requesting peers drive correct witnesses ignoring the request, which clearly makes their reputation mechanism useless. Hence there is a clear incentive for non participating peers to change their behavior. As for participating peers, when peer p receives a request from a requesting peer j then i satisfies j's request with probability
Crijt . By doing so, i satisfies j's request if it estimates that j is trustworthy, otherwise it notifies j of its recent faulty behavior by simply ignores the request. As previously, by cheating, a malicious peer penalizes itself by pushing correct witnesses to ignore its request, leading to its effective isolation. We claim that this social exclusion-based strategy motivates j to reliably cooperate.
Algorithm 1: Trust Information Exchange Protocol 1: upon (receipt of a rw(j,s,ttl,t) message at peer i) do 2: with (probability min( l itj , C rijt )) do 3:
if (i has interacted with s in the last D time units)
4:
recist ⇐〈 Dist , ρist 〉;
5:
send
recist to j;
6: else 7: ignore message; 8: end if 9: end do 10: if(ttl≠0) 11: A ⇐ getRandomNeighbor(b); // b is branching factor; 12: For each peer k in A do 13: send a rw(j,s,ttl−1,t) message to peer k; 14: send a witness(s,k,t) to j; 15: end do 16: end if 17: end do
4 Experimental Evaluation We will now evaluate the effectiveness of ICRep by means of experiments. Our intention with this section is to confirm that ICRep is robust against the collusion and badmouthing attacks, that it can effectively detect dishonest recommendations in a variety of scenarios where more complex malicious strategies are introduced, and that it is incentive compatible.
380
J. Chang et al.
4.1 Simulation Setup In our simulation, we use the topology of the system and the deception models as Bin Yu’s reputation mechanism [6]. In order to empirically evaluate our new reputation mechanism against more complex strategies, we make some changes. In our simulation experiment, the quality for a peer to be a SP (service provider) is independent of the quality for a peer to be a rater which provides recommendation. We first define the types of qualities of both SPs and raters used in our evaluation. Three types of behavior patterns of SPs are studied: good peers, fixed malicious peers and dynamic malicious peers. Good peers and fixed malicious peers provide good services and bad services without changing their qualities once the simulation starts respectively. Dynamic malicious peers alter their behavior strategically. The behaviors of peers as raters can be one of the three types: honest peers, fixed dishonest peers and dynamic dishonest peers. Honest and fixed dishonest peers provide correct and incorrect feedback without changing their patterns respectively. Dynamic dishonest peers provide correct feedback strategically, for example, the dynamic dishonest peers which tell the truth over a sustained period of time and only then start lying. Our initial simulated community consists of N peers, N is set to be 128. The percentage of the bad SPs is denoted by pb , the percentage of the bad raters is denoted by pf . Table 1 summarizes the main parameters related to the community setting and trust computation. The default values for most experiments are listed. In the default setting, 50% malicious peers are fixed malicious service providers, 50% malicious peers are dynamic malicious service providers, with 50% probability giving service maliciously. The dishonest raters are fixed dishonest peers which give complementary rating and the level of confidence is set to 1. We divide the total simulation time into multiple simulation time units. In every time unit, each peer initiates a single request that can be satisfied by any of the potential service providers. Every peer issues one service request per simulation round. When a peer receives a Table 1. Simulation Parameters Parameter N pb
Description number of peers in the community percentage of malicious peers in the community
Default 128 80%
pf
percentage of dishonest raters in the community probability peer responds to a service request
80% 0.1
λ α
self-confidence factor
Dynamic
the decay factor
0.9
c0 TTL B
initial credibility value
0.5
bound of the referral chain’s length branching factor in rating discovery algorithm exaggeration factor sliding time window the threshold number of recommendations the threshold number of interactions in formula (2)
4 2 1 10 20 5
res
ρ
D Imax M
ICRep: An Incentive Compatible Reputation Mechanism for P2P Systems
381
query, it answers it with res probability, or refers to other peers. res is set to 0.1 in the experiments. Two transaction settings are simulated, namely random setting and trusted setting. In random setting, peers randomly pick a peer from candidate peers who answer the service request to perform transactions. In trusted setting, peers select the reputable peer who has the maximal reputation value. The simulation program has been implemented in Java programming language. 4.2 Effectiveness of the Reputation Mechanism This simulation evaluates the immunity of ICRep reputation mechanism to the collusion and badmouthing attacks. This set of experiments demonstrates the benefit of reputation mechanism we proposed, peers compare the trustworthiness of peers and choose the peer with the highest trust value to interact with. A transaction is considered successful if both of the participating peers are good peers, otherwise is a failure transaction. We define successful transaction rate as the ratio of the number of successful transactions over the total number of transactions in the community up to a certain time. A community with a higher transactions success rate has a higher productivity and a stronger level of security. The experiment is performed in both non-collusive setting and collusive setting. We show the benefit of our reputation mechanism compared to a community without any trust scheme. We also compare the performance of our scheme against the trust management scheme proposed by Bin Yu in [6]. 1.4
1 0.8 0.6 0.4
1 0.8 0.6 0.4 0.2
0.2 0
Our reputation mechanism Bin Yu's scheme Without reputation mechanism
1.2 transaction success ratio
1.2 transaction success ratio
1.4
Our reputation mechanism Bin Yu's scheme Without reputation mechanism
0
2
4
6
8 10 time unit
12
14
(a) no-collusive setting
16
0
0
2
4
6
8 10 time unit
12
14
16
(b) collusive setting
Fig. 1. Effectiveness against the collusion and badmouthing attacks
Figure 1 shows the rate of success transactions with respect to the number of time units in collusive and non-collusive setting. We can see an obvious gain of the transaction success rate in communities equipped with a trust mechanism either in non-collusive setting or in collusive setting. Both ICRep and Bin Yu’s scheme can help peers avoid having interactions with malicious service providers in both settings, malicious peers are effectively identified even when they launch a collusion attack. This confirms that supporting trust is an important feature in a P2P community as peers are able to avoid untrustworthy peers. While in the collusive setting, dishonest peers’ collusive behaviors hardly disturb honest peers’ judgment. It needs more
382
J. Chang et al.
interactions to differentiate good peers from bad peers. Moreover, it is observed that Bin Yu’s scheme needs more interactions to differentiate good peers from bad peers in both setting, so ICRep outperforms Bin Yu’s reputation mechanism. 4.3 Predicting Honesty We now define a useful metric to evaluate the performance of our proposed recommendation credibility model. Definition 1. The average recommendation credibility of a witness C re j = 1
W j is
N
N
∑
i =1
(9)
C ri j
Where C rij is the credibility value of witness W j from peer Pi’s acquaintance model [6], and N is the number of peers in whose acquaintance model W j occurs. • Sensitiveness to Strategically Alter Behaviors of Peers The goal of this experiment is to show how credibility model we proposed works against strategic dynamic personality of peers. We simulated a community with all good peers but a dynamic malicious rater with dynamic personality. We simulated two changing patterns. First, peer could try to gain trust from others by telling the truth over a sustained period of time and only then start lying. Second, the peer is trying to improve its recommendation trust by telling the truth. Figure 2 illustrates the changes of recommendation trust of both the peer who is milking its recommendation trust and the peer who is building its recommendation trust in the whole process. The results indicate that our reputation mechanisms is very sensitive to peers’ strategically alter behaviors. 0.9
Milking Recommendation Trust Building Recommendation Trust
1 recommendation credibility
re co m m e n d a tio n cre d ib ility
1.2
0.8 0.6 0.4 0.2 0
exaggerated coefficient = 0.2 exaggerated coefficient = 0.3 exaggerated coefficient = 0.5
0.8
0.7
0.6
0.5
0.4
0
5
10
15 time unit
20
25
30
Fig. 2. Sensitiveness to strategically behaviors of peers
0
5
10
15 20 time unit
25
30
35
Fig. 3. Average recommendation credibility of alter witnesses for different exaggeration coefficients
ICRep: An Incentive Compatible Reputation Mechanism for P2P Systems
383
• Effects of Exaggeration Coefficient The present experiment studies the average recommendation credibility for such witnesses with different exaggeration coefficients [6]. Figure 3 shows the average recommendation credibility for witnesses with exaggerated negative ratings when exaggeration coefficient ρ is set to 0.2, 0.3, and 0.5, respectively. The results indicate that our approach can effectively detect witnesses lying to different degrees. For the witnesses with exaggerated negative ratings, their average recommendation credibility reaches to about 0.8, 0.7, and 0.5, respectively, after 10 time unit. So, the marginal lying cases can be detected. • Impact of Level of Confidence In the above two experiments, we only considered peers providing service with fixed personality. This experiment considers dynamic attack. An attacker, with x% probability, behaves maliciously by giving malicious service. In the other times, it behaves as a good peer. In this experiment, 80% peers are dynamic attackers with 50% probability giving service maliciously, other peers are good peer, and all the peers provide honest recommendations. The recommendation trust metrics has been observed to understand if honest peers assign fair recommendation trust values to each other. 0.9 recommendation credibility
0.85 0.8 0.75 0.7 0.65 0.6 0.55 0.5
0
5
10
without level of confidence with level of confidence 15 20 25 30 35 time unit
40
Fig. 4. Impact of level of confidence
Figure 4 shows the recommendation trust of honest peers in this setting, we can conclude that: without level of confidence, a peer may be penalized for an incorrect opinion that was based on a large variation in experience. Our approach allows a peer to determine the level of confidence about its recommendation. Giving incorrect recommendation can decrease the credibility of a peer. So, a peer can lower the level of confidence for opinions about which it is not very sure, therefore risking less loss of credibility in case its judgment is incorrect. If a weak recommendation is inaccurate, the recommendation credibility does not diminish quickly. A peer can not be penalized as much for an incorrect opinion that was based on a small number of interactions and/or a large variation in experience. Then, honest peers will not be falsely classified as lying.
384
J. Chang et al.
4.4 Effectiveness of Incentive Mechanism Since we have showed that our credibility model can help peers effectively detect inaccurate recommendations and generate a fair evaluation of recommendation in a variety of scenarios, we focus on the effectiveness of our incentive mechanism in stimulating active and truthful recommendations. We investigate and compare the performance of the 4 different types of recommenders similar as [20]: active truth-teller, inactive truth-teller, active liar and inactive liar. Each type of entity has the same population, i.e., 32 each. Honest recommenders recommend with their direct trust regarding the trustee, while dishonest recommenders send back lies which are complementary to their direct trust and level of confidence are set to 1. Active recommenders offer recommendations with 90% probability, while inactive ones offer with 10% probability. An active truth-teller can elicit more honest recommendations, which help him make right trust decisions regarding whether to interact with a peer or not. Therefore, first, we show the number of honest recommendations obtained by the four types of recommenders respectively. Second, we also display the number of wrong trust decisions made by different recommenders. When a peer fails to acquire any helpful recommendation, it has to base its trust decision solely on its direct experiences, which are not significant enough for a sound decision. Namely, the peer is subject to wrong trust decisions, which refer to either false positives (when an honest service provider is identified as an untrustworthy one) or false negatives (when a dishonest service provider is not identified as being so). • Elicited Honest Recommendations Figure 5 shows the number of elicited honest recommendations for different type of recommenders. We can see that at the beginning, very few recommendations are propagated and the four types of recommenders do not have much difference in the number of obtained honest recommendations. With the accumulation of experiences, the honest peers have enough experiences to recommend. Recommendation credibility is gradually recognized and the order of benefit (AT > IT > IL > AL) starts to be established, from simulation round 5 in Figure 5. 60
800
AT recommendations IT recommendations 700 AL recommendations IL recommendations
AT wrong trust decisions IT wrong trust decisions AL wrong trust decisions IL wrong trust decisions
50
600
number
number
40 500
400
30 20
300
10
200
0
100 1
2
3
4
5 time unit
6
7
8
9
1
2
3
4 5 time unit
6
7
8
Fig. 5. Number of elicited honest recommendations Fig. 6. Number of wrong trust decisions
ICRep: An Incentive Compatible Reputation Mechanism for P2P Systems
385
• Wrong Trust Decisions Figure 6 presents the number of wrong trust decisions made by the four types of recommenders. It can be seen that, with more accumulated experiences, every type of recommenders make less and less wrong trust decisions. Especially, with the help of honest recommendations, AT peers make the least number of wrong trust decisions and AL peers make the most (the order of AT > IT > IL > AL is enforced). Note that dishonest or inactive recommenders can also tell the honesty and activeness of a recommender using reputation system. However, they have access to less number of truthful recommendations for making the right trust decision. From these two experiments, we can get a conclusion that ICRep provides peers with the right incentives for truthful reporting of feedback information, as sincere peers receive always more benefit from the peer-to-peer system than liar peers, whose benefit is very low. Thus, the incentive mechanism is effective.
5 Conclusions and Future Work We present ICRep: an incentive compatible reputation system for P2P systems. Within this system a peer can reason about trustworthiness of other peers based on the available local information which includes past interactions and recommendations received from others. We focus on how to stimulate reputation information sharing and honest recommendation elicitation. Theoretic analysis and simulation show that ICRep can help peers effectively detect dishonest recommendations in a variety of scenarios where more complex malicious strategies are introduced. the precision of inaccuracy detection is improved (e.g. more marginal lying cases can be detected, and honest witnesses will not be falsely classified as lying because of an increased fluctuation in a provider’s performance). Moreover, it can also stimulate peers to send sufficiently honest recommendations. The latter is realized by ensuring that active and honest recommenders, compared to inactive or dishonest ones, can receive always more benefit from the peer-to-peer system. Interactions (service providing and recommendation providing) that occur within a sliding window of width D are considered, therefore, storage requirements for storing trust information are tolerable. The main overhead of our reputation mechanism comes from the reputation queries. In a service session, one provider is selected but reputation values about other providers are deleted. Thus, reputation values about unselected providers can be cached. Since a peer obtains more acquaintances with time, number of cache entries and cache hit ratio increase with time. By this way, we can reduce the overhead of our reputation mechanism comes from the reputation queries. As a next step, we will be evaluating our reputation mechanism as applied to a peer-to-peer network.
References 1. Oram, A.: Peer to Peer: Harnessing the power of disruptive technologies (2001) ISBN 0596-00110-X 2. Aberer, K., Despotovic, Z.: Managing Trust in a Peer-2-Peer Information System. In: The Proceedings of Intl. Conf. on Information and Knowledge Management (2001)
386
J. Chang et al.
3. Kamwar, S.D., Schlosser, M.T., Garcia-Molina, H.: The Eigentrust Algorithm for Reputation Management in P2P Networks. In: The Proceedings of the twelfth international conference on World Wide Web, Budapest, Hungary (2003) 4. Damiani, E., De Capitani di Vimercati, S., Paraboschi, S., Samarati, P.: Managing and sharing servents’ reputations in p2p systems. IEEE Transactions on Data and Knowledge Engineering 15(4), 840–854 (2003) 5. Xiong, L., Liu, L.: PeerTrust: Supporting reputation-based trust in peer-to-peer communities. IEEE Transactions on Data and Knowledge Engineering, Special Issue on Peer-to-Peer Based Data Management 16(7), 843–857 (2004) 6. Yu, B., Singh, M.P., Sycara, K.: Developing trust in large-scale peer-to-peer systems. In: Proceedings of First IEEE Symposium on Multi-Agent Security and Survivability (2004) 7. Chang, J., Wang, H., Yin, G.: A Time-Frame Based Trust Model for P2P Systems. In: Proceedings of 9th International Conference on Information Security Cryptology, Seoul, Korea (2006) 8. Jurca, R., Faltings, B.: An Incentive Compatible Reputation Mechanism. In: Proc. of the IEEE Conference on Electronic Commerce, Newport Beach, CA, USA (June 2003) 9. Lam, S.K., Riedl, J.: Shilling recommender systems for fun and profit. In: Proceedings of the 13th World Wide Web Conference (2004) 10. Srivatsa, M., Xiong, L., Liu, L.: TrustGuard: countering vulnerabilities in reputation management for decentralized overlay networks. In: WWW 2005, pp. 422–431 (2005) 11. Withby, A., Jøsang, A., Indulska, J.: Filtering Out Unfair Ratings in Bayesian Reputation Systems. In: Proceedings of the 7th Int. Workshop on Trust in Agent Societies (at AAMAS 2004), ACM, New York (2004) 12. Dellarocas, C.: Immunizing Online Reputation Reporting Systems Against Unfair Ratings and Discriminatory Behavior. In: ACM Conference on Electronic Commerce, pp. 150–157 (2000) 13. Chen, M., Singh, J.: Computing and Using Reputations for Internet Ratings. In: Proceedings of the Third ACM Conference on Electronic Commerce (EC 2001), ACM, New York (2001) 14. Feng, Z.J., Xian, T., Feng, G.J.: An Optimized Collaborative Filtering Recommendation Algorithm. Journal of Computer Research and Development 41(10) (2004) (in Chinese) 15. Dellarocas, C.: Reputation mechanisms. In: Hendershott, T. (ed.) Handbook on Information Systems and Economics, Elsevier, Amsterdam (2006) 16. Fernandes, A., Kotsovinos, E., Ostring, S., Dragovic, B.: Pinocchio: Incentives for honest participation in distributed trust management. In: Jensen, C., Poslad, S., Dimitrakos, T. (eds.) iTrust 2004. LNCS, vol. 2995, Springer, Heidelberg (2004) 17. Papaioannou, T.G., Stamoulis, G.D.: An Incentives’ Mechanism Promoting Truthful Feedback in Peer-to-Peer Systems. In: Proceedings of the 5th IEEE/ACM International Symposium in Cluster Computing and the Grid, Cardi, UK (2005) 18. Dellarocas, C.: Reputation mechanisms. In: Hendershott, T. (ed.) Handbook on Information Systems and Economics, Elsevier, Amsterdam (2006) 19. Beth, T., Borcherding, M., Klein, B.: Valuation of Trust in Open Networks. In: Gollmann, D. (ed.) Computer Security - ESORICS 1994. LNCS, vol. 875, Springer, Heidelberg (1994) 20. Liu, J., Issarny, V.: An incentive compatible reputation mechanism for ubiquitous computing environments. In: PST 2006. International Conference on Privacy, Security and Trust, Toronto, Canada (2006) 21. Huynh, T.D., Jennings, N.R., Shadbolt, N.: On handling inaccurate witness reports. In: Proc. 8th International Workshop on Trust in Agent Societies, Utrecht, The Netherlands, pp. 63–77 (2005)
Author Index
Aickelin, Uwe
157
Chan, Herwin 102 Chang, Junsheng 371 Choi, Myeonggil 359 Choo, Kim-Kwang Raymond Eckert, Claudia 142 Elliott, Stephen J. 48 Feyereisl, Jan
157
Goi, Bok-Min Gu, Yuan X.
245 61
Lo, Swee-Won Lu, Zhengding
30
Main, Alec 61 Marnane, William P. 317 Matsumoto, Tsutomu 128 McEvoy, Robert 317 Moon, SangJae 333 Mu, Yi 16 Murphy, Colin C. 317 Okamoto, Eiji
Ha, JaeCheol 333 Han, Lilong 266 Hashimoto, Toru 173 Hatano, Yasuo 215 Huang, Hao 291 Huang, Wei 291 Huang, Xinyi 16 Imai, Hideki 173 Itoh, Takashi 173 Iwai, Hisato 173 Jang, Jihyeon 48 Johnson, Harold 61 Kaneko, Toshinobu 215 Kawahara, Yuto 1 Kim, Hakil 48 Kim, Hyoung-Joong 76 Kim, Jeong-Nyeo 91 Kiyomoto, Shinsaku 203 Kobara, Kazukuni 173 Komoda, Norihisa 128 Lee, Byoungcheon 30 Lee, ChoelHoon 91 Lee, Yong Ki 102, 115 Li, Ruixuan 277 Lim, Jae-Deok 91 Liu, Qingtan 266
245 277
1
Park, JeaHoon 333 Phan, Raphael C.-W. 245 Prouff, Emmanuel 227 Pyshkin, Andrei 188 Rivain, Matthieu 227 R¨ oder, Patrick 142 Saisho, Hideaki 128 Sameshima, Yoshiki 128 Sasaoka, Hideichi 173 Shin, Sangmun 359 Shirase, Masaaki 1 Stumpf, Frederic 142 Sugio, Nobuyuki 215 Susilo, Willy 16 Takagi, Tsuyoshi 1, 203 Tanaka, Hidema 215 Tanaka, Toshiaki 203 Tang, Yangbin 371 Tang, Zhuo 277 Tews, Erik 188 Tunstall, Michael 317 Ueba, Masazumi Un, Sung-Kyong
173 91
Verbauwhede, Ingrid Walter, Colin D. Wang, Huaimin
303 371
102, 115
388
Author Index
Weinmann, Ralf-Philipp 188 Wen, Zhumu 277 Wilson, William O. 157 Wu, Wei 16 Xia, Lei 291 Xiang, Shijun 76 Xu, Shenkun 345
Yang, Jeongmo 30 Yang, Zongkai 266 Ye, Xiaojun 345 Yen, SungMing 333 Yin, Gang 371 Yoo, Seungjae 30 Yoshitomi, Motoi 203 Zhou, Yongxin
61