SHA-3 proposal BLAKE
BLAKE is one of the five finalists of the NIST SHA-3 competition. It is one of the simplest designs to implement, and it relies on previously analyzed components: the HAIFA structure and the ChaCha core function.
The two main instances of BLAKE are BLAKE-256 and BLAKE-512. They respectively work with 32- and 64-bit words, and produce 256- and 512-bit digests.
BLAKE combines a high security margin with good performance across a wide range of platforms:
- On an Intel Core i5-2400M (Sandy Bridge), BLAKE-256 can hash at 7.49 cycles/byte and BLAKE-512 at 5.64 cycles/byte (details).
- On an AMD FX-8120 (Bulldozer), BLAKE-256 can hash at 11.83 cycles/byte and BLAKE-512 at 6.88 cycles/byte (details).
- On a Cortex-M3 based microcontroller (32-bit processor), BLAKE-256 can be implemented with 280 bytes of RAM and 1320 bytes of ROM, and BLAKE-512 with 516 bytes of RAM and 1776 bytes of ROM (details).
- On an ATmega1284P microcontroller (8-bit processor), BLAKE-256 can be implemented with 267 bytes of RAM and 3434 bytes of ROM, and BLAKE-512 with 525 bytes of RAM and 6350 bytes of ROM (details).
- On a Xilinx Virtex 5 FPGA, BLAKE-256 implemented with 56 slices can reach a throughput of more than 160 Mbps, and BLAKE-512 with 108 slices can reach a throughput of more than 270 Mbps (details).
- In 180 nm ASIC, BLAKE-256 can be implemented with 13.5 kGE. In 90 nm ASIC, BLAKE-256 implemented with 38 kGE can reach a throughput of more than 10 Gbps, and BLAKE-512 with 79 kGE can reach a throughput of more than 15 Gbps (details).
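To relate the cycles/byte figures above to wall-clock throughput, divide the clock frequency by the cost in cycles per byte. The following is a minimal sketch: the 3.0 GHz clock is an assumption chosen for illustration (the clock rates of the benchmarked machines are not stated here), and the function name is hypothetical.

```python
def throughput_mib_per_s(cycles_per_byte: float, clock_hz: float) -> float:
    """Return hashing throughput in MiB/s for a given cost in cycles/byte."""
    bytes_per_second = clock_hz / cycles_per_byte
    return bytes_per_second / (1024 * 1024)


# Cycles/byte figures quoted in the list above.
FIGURES = {
    ("Sandy Bridge", "BLAKE-256"): 7.49,
    ("Sandy Bridge", "BLAKE-512"): 5.64,
    ("Bulldozer", "BLAKE-256"): 11.83,
    ("Bulldozer", "BLAKE-512"): 6.88,
}

# Assumption: a 3.0 GHz clock, used only to make the arithmetic concrete.
ASSUMED_CLOCK_HZ = 3.0e9

for (cpu, name), cpb in FIGURES.items():
    rate = throughput_mib_per_s(cpb, ASSUMED_CLOCK_HZ)
    print(f"{cpu:12s} {name}: {cpb:5.2f} cycles/byte ~ {rate:6.1f} MiB/s")
```

On 64-bit processors BLAKE-512 costs fewer cycles per byte than BLAKE-256, so under the same assumed clock it yields the higher throughput.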
BLAKE was designed by:
- Jean-Philippe Aumasson (Kudelski Security, Switzerland)
- Luca Henzen (then ETHZ, Switzerland; now UBS, Switzerland)
- Willi Meier (FHNW, Switzerland)
- Raphael C.-W. Phan (Loughborough University, UK)
Final BLAKE
Initially, the BLAKE functions were named BLAKE-28, BLAKE-32, BLAKE-48, and BLAKE-64. In December 2010, a final BLAKE version was announced, as allowed by NIST, and the functions were renamed BLAKE-224, BLAKE-256, BLAKE-384, and BLAKE-512 to distinguish the final BLAKE from its initial version. The final BLAKE differs from the initial version by an increased number of rounds: 14 instead of 10 for BLAKE-224 and BLAKE-256, and 16 instead of 14 for BLAKE-384 and BLAKE-512. BLAKE is fast enough that a very conservative security margin could be chosen for the final version while remaining faster than SHA-2 on a number of platforms.
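The naming and round changes can be summarized in a small lookup table. This is an illustrative sketch, not part of any BLAKE reference code: the class and function names are hypothetical, and the word sizes for BLAKE-224 and BLAKE-384 (32 and 64 bits) are taken to match BLAKE-256 and BLAKE-512 respectively. The old names count digest bytes, the new names digest bits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlakeParams:
    initial_name: str    # name used before December 2010
    word_bits: int       # word size the function operates on
    digest_bits: int     # output length
    initial_rounds: int  # rounds in the initial submission
    final_rounds: int    # rounds in final BLAKE


PARAMS = {
    "BLAKE-224": BlakeParams("BLAKE-28", 32, 224, 10, 14),
    "BLAKE-256": BlakeParams("BLAKE-32", 32, 256, 10, 14),
    "BLAKE-384": BlakeParams("BLAKE-48", 64, 384, 14, 16),
    "BLAKE-512": BlakeParams("BLAKE-64", 64, 512, 14, 16),
}


def round_increase(p: BlakeParams) -> float:
    """Ratio of final to initial round count."""
    return p.final_rounds / p.initial_rounds


for name, p in PARAMS.items():
    print(f"{name} (was {p.initial_name}): {p.initial_rounds} -> "
          f"{p.final_rounds} rounds (x{round_increase(p):.2f})")
```

If hashing cost scaled purely with the number of rounds (an assumption that ignores initialization and finalization), the ratio gives a rough upper estimate of the slowdown from the initial to the final version.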
Downloads
- Documentation, including specification, implementation report, preliminary analysis
- Toy versions BLOKE, FLAKE, BLAZE, and BRAKE
- Slides of the presentation of BLAKE at the First SHA-3 Conference
- Slides of the presentation of BLAKE at the Second SHA-3 Conference
- Slides of the presentation of BLAKE at the Third SHA-3 Conference
- Slides of the presentation "Quo vadis BLAKE?" at the 2011 "Quo Vadis Cryptology?" workshop
- Reference C implementations:
  - blake_c.tar.gz: C implementations with command-line interface to hash files, simpler and shorter code than the NIST reference (also on GitHub)
  - blake_ref.c, blake_ref.h: reference implementation for NIST's API (2015.09.07: fixed a bug that gave incorrect hashes in specific use cases)
- Reference VHDL implementations:
  - blake_vhdl_v2.tar.gz: reference implementations, with four different architectures
  - http://www.iis.ee.ethz.ch/~sha3/blake/: speed-optimized implementations (as described in HAMP10.pdf)
  - compact_blake256_vhdl.tar.gz: low-area implementation of BLAKE-256 (as described in HAMP10.pdf)
- blakechip.jpg: picture of the chip containing our 13.5 kGE implementation of the full BLAKE-256
- The BLAKE building (Washington, DC)
Cryptanalysis
Some of the results below were presented for the initial version of BLAKE, but apply as well to final BLAKE.
- 2011 Nov 18: Donghoon Chang, Mridul Nandi, Moti Yung. Indifferentiability of the hash algorithm BLAKE. IACR ePrint archive, report 2011/620
Main result: proof of indifferentiability
- 2011 Nov 17: Elena Andreeva, Atul Luykx, Bart Mennink. Provable security of BLAKE with non-ideal compression function. Third SHA-3 Conference, IACR ePrint archive, report 2011/620
Main result: proof of indifferentiability
- 2011 May 19: Orr Dunkelman, Dmitry Khovratovich. Iterative differentials, symmetries, and message modification in BLAKE-256. ECRYPT2 Hash Workshop 2011
Main result: distinguisher for the permutation of BLAKE-256 reduced to 6 middle rounds, with complexity 2^456
- 2011 May 12: Jean-Philippe Aumasson, Gaëtan Leurent, Willi Meier, Florian Mendel, Nicky Mouha, Raphael C.-W. Phan, Yu Sasaki, Petr Susil. Tuple cryptanalysis of ARX with application to BLAKE and Skein. ECRYPT2 Hash Workshop 2011
Main result: distinguisher for the permutation of BLAKE-256 reduced to 4 middle rounds, with complexity 2^64
- 2011 Mar 23: Dmitry Khovratovich, Gaëtan Leurent, María Naya-Plasencia. Observations on Blake (slides). Technical report
Main result: conjectured distinguisher for 10 rounds of the permutation
- 2011 Feb 15: Alex Biryukov, Ivica Nikolic, Arnab Roy. Boomerang attacks on BLAKE-32 (slides). FSE 2011
Main result: distinguishers for the compression function (resp. permutation) of BLAKE-256 reduced to 7 (resp. 8) rounds, with complexity 2^232 (resp. 2^242)
- 2010 Dec 17: Mao Ming, He Qiang, Shaokun Zeng. Security analysis of BLAKE-32 based on differential properties (abstract). ICCIS 2010
Main result: analysis of differential properties, and evidence that the attack considered is inapplicable to 6 rounds of BLAKE-256
- 2010 Aug 23: Meltem Sönmez Turan, Erdener Uyan. Practical near-collisions for reduced round Blake, Fugue, Hamsi and JH. Second SHA-3 Conference
Main result: near-collision attacks on resp. 209 and 184 bits for the compression function of BLAKE-256 reduced to resp. 1.5 and 2 rounds, with complexity 2^26
- 2010 Jul 1: Janoš Vidali, Peter Nose, Enes Pašalic. Collisions for variants of the BLAKE hash function. Information Processing Letters, volume 110, issues 14-15
Main result: efficient collision attacks for the toy version BLOKE, and for the compression function of the toy version BRAKE
- 2010 Jun 18: Bozhan Su, Wenling Wu, Shuang Wu, Le Dong. Near collisions on the reduced-round compression functions of Skein and BLAKE. IACR ePrint archive, report 2010/355
Main result: near-collision attacks on resp. 152, 396, and 306 bits for the compression function of BLAKE-256, BLAKE-512, and BLAKE-512 reduced to resp. 4, 4, and 5 middle rounds, with complexity 2^21, 2^16, and 2^216
- 2010 Jan 29: Jean-Philippe Aumasson, Jian Guo, Simon Knellwolf, Krystian Matusiewicz, Willi Meier. Differential and invertibility properties of BLAKE. FSE 2010. IACR ePrint archive, report 2010/043
Main result: proof that one round is a permutation of the message, for a fixed state; improved preimage attack on 1.5 rounds; impossible differentials for the permutation with 5 (resp. 6) rounds for BLAKE-256 (resp. BLAKE-512)
- 2009 Dec 7: Lei Wang, Kazuo Ohta, Kazuo Sakiyama. Free-start preimages of step-reduced Blake compression function. Rump session of ASIACRYPT 2009
Main result: preimage attacks for the permutation of BLAKE-256 reduced to 4.5 rounds and followed by the finalization, with complexity 2^252 and memory 2^8
- 2009 Jun 23: Jian Guo, Krystian Matusiewicz. Round-reduced near-collisions of BLAKE-32. WEWoRC 2009
Main result: near-collision attack on 232 bits for the compression of BLAKE-256 reduced to 4 middle rounds (rounds 3 to 6), with complexity 2^56; uses differences in the chaining value, the salt, the counter, and the message
- 2009 May 26: Li Ji, Xu Liangyu. Attacks on round-reduced BLAKE. IACR ePrint archive, report 2009/238
Main result: collision and preimage attacks for BLAKE with compression function reduced to 2.5 rounds. Respectively for BLAKE-224, -256, -384, and -512, collision attacks have complexities 2^96, 2^112, 2^160, and 2^224; preimage attacks have complexities 2^209, 2^241, 2^355, and 2^481
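To put such complexities in perspective, they can be compared with the generic bounds for an ideal n-bit hash: about 2^(n/2) for collisions (birthday bound) and 2^n for preimages. The sketch below does this for the 2.5-round figures of report 2009/238 listed above. The names are hypothetical, and the comparison is only indicative, since the attacks target a reduced-round compression function rather than the full hash.

```python
# log2 of the attack complexities reported for 2.5 rounds (report 2009/238),
# indexed by digest length in bits.
ATTACKS_LOG2 = {
    224: {"collision": 96, "preimage": 209},
    256: {"collision": 112, "preimage": 241},
    384: {"collision": 160, "preimage": 355},
    512: {"collision": 224, "preimage": 481},
}


def generic_log2(kind: str, digest_bits: int) -> float:
    """log2 of the generic attack cost against an ideal hash."""
    if kind == "collision":
        return digest_bits / 2   # birthday bound
    if kind == "preimage":
        return float(digest_bits)
    raise ValueError(f"unknown attack kind: {kind}")


def gain_bits(kind: str, digest_bits: int) -> float:
    """How many bits the reported attack saves over the generic bound."""
    return generic_log2(kind, digest_bits) - ATTACKS_LOG2[digest_bits][kind]


for n in ATTACKS_LOG2:
    print(f"BLAKE-{n}: collision gain {gain_bits('collision', n):.0f} bits, "
          f"preimage gain {gain_bits('preimage', n):.0f} bits")
```

Even on 2.5 rounds out of 14 or 16, the reported complexities remain far beyond practical reach, which is consistent with the high security margin claimed for final BLAKE.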
Software implementations
Some of the performance results below were obtained on the initial version of BLAKE; the speed figures therefore do not apply to final BLAKE, but the memory estimates (ROM and RAM) are the same.
Speed measurements on various software platforms can be found on eBASH and on XBX.
The latest versions of the fastest C and assembly implementations can be found in the latest release of SUPERCOP.
- 2012 Jul 24: Dmitry Chestnykh. dart-blake.
Main result: Dart implementation of BLAKE-256
- 2012 May 16: Samuel Neves, Jean-Philippe Aumasson. Implementing BLAKE with AVX, AVX2, and XOP.
Main result: extended version of the SHA-3 Conference paper with refined analysis of AVX2 and XOP implementations
- 2012 Apr 3: Samuel Neves, Jean-Philippe Aumasson. BLAKE and 256-bit advanced vector extensions. Third SHA-3 Conference
Main result: implementations using AVX, XOP (available in SUPERCOP), and AVX2 extensions (available here)
- 2012 Feb 29: Mark Rhodes. Blake-512 in Javascript.
Main result: Javascript implementation of BLAKE-512
- 2012 Jan 8: Christian Wenzel-Benner. arm_thumb2. (link to SUPERCOP)
Main result: port of the arm11 implementation to Thumb-2 instruction set (as required by ARM cores such as the Cortex-M3).
- 2012 Jan 3: David Lazar. HMAC mode for BLAKE.
Main result: C implementation of HMAC for all instances of BLAKE
- 2011 Nov 21: Peter Schwabe, Bo-Yin Yang, Shang-Yi Yang. arm11. (link to SUPERCOP)
Main result: assembly implementation of BLAKE-256 for ARM11 architecture
- 2011 Nov 21: Ingo von Maurich. Blake256-AVR-asm.
Main result: assembly implementation of BLAKE-256 for 8-bit AVR ATmega microcontrollers, using 251 bytes of RAM and running at 456 cycles/byte
- 2011 Nov 21: Dominik Reichl. BlakeSharp.
Main result: C# implementations of BLAKE-256 and BLAKE-512 (.NET and Mono compatible)
- 2011 Nov 15: Dmitry Chestnykh. blake256.
Main result: Go implementation of BLAKE-256
- 2011 Nov 14: Marc Greim. blake-512-java-implementation.
Main result: Java implementation of BLAKE-512
- 2011 Nov 14: Kevin Cantu. Haskell-BLAKE.
Main result: Haskell implementation of BLAKE
- 2011 Aug 12: Gaëtan Leurent. Vectorized BLAKE implementations. (link to SUPERCOP)
Main result: C implementations of BLAKE-256 and BLAKE-512 exploiting the SSSE3 extensions and ARM's NEON extensions
- 2011 May 31: Thomas Burgess, Joseph Jelley, David Smith, Claire Weston. BLAKE256_matlab.zip.
Main result: MATLAB implementation of BLAKE-256, non-object-oriented
- 2011 May 26: Zeke Steer. Blake_256.m.
Main result: MATLAB implementation of BLAKE-256, object-oriented (test program)
- 2011 May 12: Larry Bugbee. blake.py.
Main result: Python (2 and 3) implementations of BLAKE
- 2011 Jan 27: Daniel Correa. blakehash-php.
Main result: PHP extension implementing BLAKE
- 2010 Dec 14: Gray. Digest::BLAKE.
Main result: Perl interface to BLAKE
- 2010 Aug 19: Joppe W. Bos and Deian Stefan. Performance analysis of the SHA-3 candidates on exotic multi-core architectures. CHES 2010
Main result: parallel implementation of BLAKE-32 on a Cell Broadband Engine (processor for Sony PS3) running at 5 cycles/byte and on NVIDIA GTX 295 GPU at 0.27 cycles/byte
- 2010 May 11: Thomas Pornin. sphlib.
Main result: C and Java implementation of BLAKE-256 and BLAKE-512 in the sphlib library and speed measurements on various platforms
- 2010 May 10: Christopher Drost. sha3-js.
Main result: Javascript implementation of BLAKE-32 (see also the online demo)
- 2009 Oct 7: Samuel Neves. ChaCha implementation.
Main result: C implementations of BLAKE-32 and BLAKE-64 optimized for Intel Core 2 and i7 processors using SSSE3 extensions; on a Core 2 E8400, measured speed-up from 10.34 to 9.05 cycles/byte for BLAKE-32, and from 13.65 to 11.80 for BLAKE-64
- 2009 May 29: Kota Ideguchi, Toru Owada, Hirotaka Yoshida. A study on RAM requirements of various SHA-3 candidates on low-cost 8-bit CPUs. IACR ePrint archive, report 2009/260
Main result: estimates RAM requirements of BLAKE-32 on "low-bit 8-bit CPUs" to 96 bytes
- 2009 May 25: Daniel Otte. AVR-Crypto-Lib/en.
Main result: C implementations of BLAKE on AVR microcontroller, running at 1115 cycles/byte for BLAKE-28 and -32, and 3989 cycles/byte for BLAKE-48 and -64
Hardware implementations
Some of the performance results below were obtained on the initial version of BLAKE; the throughput figures therefore do not apply to final BLAKE, but the area estimates (gate equivalents, slices) are the same.
- 2012 Mar 13: Jens-Peter Kaps, Panasayya Yalla, Kishore Kumar Surapathi, Bilal Habib, Susheel Vadlamudi, Smriti Gurung, John Pham. Lightweight implementations of SHA-3 candidates on FPGAs. INDOCRYPT 2011
Main result: lightweight implementation of BLAKE-256 on Spartan 3, Virtex 5, Virtex 6, and Cyclone II FPGA devices
- 2012 Jan 23: Xu Guo, Meeta Srivastav, Sinan Huang, Michael B. Henry, Leyla Nazhandali, Patrick Schaumont. ASIC implementations of five SHA-3 finalists. DATE 2012
Main result: implementation of BLAKE-256 on 130 nm ASIC
- 2011 Sep 26: Olakunle Esuruoso. High Speed FPGA Implementation of Cryptographic Hash Function. Master's thesis, U Windsor, Canada
Main result: implementation of BLAKE-256 on Cyclone II FPGA device using Altera's Nios II build tools to save computation by memorizing frequently hashed prefixes
- 2011 May 19: Ekawat Homsirikamol, Marcin Rogawski, Kris Gaj. Comparing hardware performance of round 3 SHA-3 candidates using multiple hardware architecture in Xilinx and Altera FPGAs. ECRYPT2 Hash Workshop 2011
Main result: implementations of BLAKE-256 and BLAKE-512 on Virtex 5, Virtex 6, Stratix III, and Stratix IV FPGA devices
- 2011 May 19: Malik Umar Sharif, Rabia Shahid, Marcin Rogawski, Kris Gaj. Use of embedded FPGA resources in implementations of five round three SHA-3 candidates. ECRYPT2 Hash Workshop 2011
Main result: implementations of BLAKE-256 on Virtex 5, Spartan 3, Stratix III, and Cyclone II FPGA devices
- 2011 May 19: Stéphanie Kerckhof, François Durvaux, Nicolas Veyrat-Charvillon, Francesco Regazzoni. Compact FPGA implementations of the five SHA-3 finalists. ECRYPT2 Hash Workshop 2011
Main result: implementation of BLAKE-512 on Virtex 6 and Spartan 6 FPGA devices
- 2011 May 19: Xu Guo, Meeta Srivastav, Sinan Huang, Leyla Nazhandali, Patrick Schaumont. Silicon implementation of SHA-3 finalists: BLAKE, Grostl, JH, Keccak and Skein. ECRYPT2 Hash Workshop 2011
Main result: implementation of BLAKE-256 on 130 nm ASIC
- 2011 May 12: Miroslav Knežević, Kazuyuki Kobayashi, Jun Ikegami, Shin’ichiro Matsuo, Akashi Satoh, Ünal Kocabaş, Junfeng Fan, Toshihiro Katashita, Takeshi Sugawara, Kazuo Sakiyama, Ingrid Verbauwhede, Kazuo Ohta, Naofumi Homma, Takafumi Aoki. Fair and consistent hardware evaluation of fourteen round two SHA-3 candidates. IEEE T VLSI
Main result: implementations of BLAKE-32 on Virtex 5 and on 90 nm ASIC
- 2010 Dec 1: Simon Hoerder, Marcin Wojcik, Stefan Tillich, Dan Page. An evaluation of hash functions on a power analysis resistant processor architecture. IACR ePrint archive, report 2010/614
Main result: implementation of BLAKE-32 on the Power-Trust platform
- 2010 Aug 23: Luca Henzen, Jean-Philippe Aumasson, Willi Meier, Raphael C.-W. Phan. VLSI characterization of the cryptographic hash function BLAKE. IEEE T VLSI
Main result: various implementations of BLAKE-32 and BLAKE-64 on 90, 130, and 180 nm technology
- 2010 Aug 23: Stefan Tillich, Martin Feldhofer, Mario Kirschbaum, Thomas Plos, Jörn-Marc Schmidt, Alexander Szekely. Uniform evaluation of hardware implementations of the round-two SHA-3 candidates. Second SHA-3 Conference
Main result: implementation of BLAKE-32 on 0.18 µm technology in 38.9 kGE and achieving a throughput of 3.355 Gbps
- 2010 Aug 23: Xu Guo, Sinan Huang, Leyla Nazhandali, Patrick Schaumont. Fair and comprehensive performance evaluation of 14 second round SHA-3 ASIC implementations. Second SHA-3 Conference
Main result: implementation of BLAKE-32 on 0.13 µm technology in 30.4 kGE (resp. 43.5 kGE) and achieving a throughput of 196 Mbps (resp. 845 Mbps)
- 2010 Aug 23: Brian Baldwin, Neil Hanley, Mark Hamilton, Liang Lu, Andrew Byrne, Maire O’Neill, William P. Marnane. FPGA implementations of the round two SHA-3 candidates. Second SHA-3 Conference
Main result: implementations of BLAKE-32 (resp. BLAKE-64) on a Virtex 5 FPGA device with 1118 (resp. 1718) slices and achieving a throughput of 1169 Mbps (resp. 1299 Mbps)
- 2010 Aug 23: Shin'ichiro Matsuo, Miroslav Knežević, Patrick Schaumont, Ingrid Verbauwhede, Akashi Satoh, Kazuo Sakiyama, Kazuo Ota. How can we conduct "fair and consistent" hardware evaluation for SHA-3 candidate? Second SHA-3 Conference
Main result: implementations of BLAKE-32 on a Virtex 5 FPGA device with 3053 slices and achieving a throughput of 2676 Mbps
- 2010 Aug 23: Kris Gaj, Ekawat Homsirikamol, Marcin Rogawski. Comprehensive comparison of hardware performance of fourteen round 2 SHA-3 candidates with 512-bit outputs using field programmable gate arrays. Second SHA-3 Conference
Main result: implementations of BLAKE-32 on Spartan 3, Virtex 4, Virtex 5, Cyclone II, Cyclone III, Stratix II, and Stratix III FPGA devices; for example on Virtex 5, BLAKE-32 is implemented with 1871 slices, and achieves a throughput of 2853.9 Mbps
- 2010 Aug 19: Luca Henzen, Pietro Gendotti, Patrice Guillet, Enrico Pargaetzi, Martin Zoller, Frank K. Gürkaynak. Developing a hardware evaluation method for SHA-3 candidates. CHES 2010
Main result: implementation of BLAKE-32 on 0.09 µm technology in 16 kGE (resp. 47.5 kGE), and achieving a throughput of 0.452 Gbps (resp. 9.752 Gbps)
- 2010 Aug 19: Kris Gaj, Ekawat Homsirikamol, Marcin Rogawski. Fair and comprehensive methodology for comparing hardware performance of fourteen round two SHA-3 candidates using FPGAs. CHES 2010
Main result: implementations of BLAKE-32 on Spartan 3, Virtex 4, Virtex 5, Cyclone II, Cyclone III, Stratix II, and Stratix III FPGA devices; for example on Virtex 5, BLAKE-32 is implemented with 1851 slices, and achieves a throughput of 2610.6 Mbps
- 2010 Jul 5: Nicolas Sklavos, Paris Kitsos. BLAKE hash function family on FPGA: from the fastest to the smallest. IEEE ISVLSI 2010
Main result: implementation of all BLAKE instances on a Virtex 4 FPGA device; for example BLAKE-32 is implemented with 3101 slices and achieves a throughput of 128 Mbps
- 2010 Apr 1: Jean-Luc Beuchat, Eiji Okamoto, Teppei Yamazaki. Compact implementations of BLAKE-32 and BLAKE-64 on FPGA. IACR ePrint archive, report 2010/173
Main result: compact implementations of BLAKE-32 and BLAKE-64 on Spartan 3, Virtex 4, Virtex 5, and Cyclone III FPGA devices; for example on Virtex 5, BLAKE-32 (resp. BLAKE-64) is implemented with 56 (resp. 108) slices, and achieves a throughput of 225 (resp. 314) Mbps
- 2010 Jan 10: Kazuyuki Kobayashi, Jun Ikegami, Shin’ichiro Matsuo, Kazuo Sakiyama, Kazuo Ohta. Evaluation of hardware performance for the SHA-3 candidates using SASEBO-GII. IACR ePrint archive, report 2010/010
Main result: implementation of BLAKE-32 on the SASEBO-GII FPGA platform with 1660 slices, 1393 slice registers, and 5154 slice LUTs, and achieving a throughput of 487 Mbps
- 2009 Oct 21: Stefan Tillich, Martin Feldhofer, Mario Kirschbaum, Thomas Plos, Jörn-Marc Schmidt, Alexander Szekely. High-speed hardware implementations of BLAKE, Blue Midnight Wish, CubeHash, ECHO, Fugue, Grøstl, Hamsi, JH, Keccak, Luffa, Shabal, SHAvite-3, SIMD, and Skein. IACR ePrint archive, report 2009/510
Main result: implementation of BLAKE-32 on 0.18 µm technology in 45.6 kGE, and achieving a throughput of 4 Gbps
- 2009 Jul 14: Stefan Tillich, Martin Feldhofer, Wolfgang Issovits, Thomas Kern, Hermann Kureck, Michael Mühlberghuber, Georg Neubauer, Andreas Reiter, Armin Köfler, Mathias Mayrhofer. Compact hardware implementations of the SHA-3 candidates ARIRANG, BLAKE, Grøstl, and Skein. IACR ePrint archive, report 2009/349
Main result: implementation of BLAKE-32 on 0.35 µm technology in 25 kGE, and achieving a throughput of 15.4 Mbps