1) You might have heard of @TensorFlow Quantum this week. What's it all about?
At @pennylaneai, we've been developing QML software and algorithms for a couple of years now. Since there is now growing interest, we want to give a VIP guided tour of the current landscape of QML. [THREAD]
2) Like PennyLane, TensorFlow Quantum is based on a few key ideas:
-Quantum circuits are differentiable programs
-Quantum circuits can be trained by gradient descent with the parameter-shift rule
-Quantum circuits can be nodes in differentiable hybrid computations
3) First, quantum circuits are *differentiable programs*. Differentiable programming is an emerging paradigm in computer science. The most notable examples are neural networks/deep learning, but the paradigm is more general. Other examples are "Neural ODEs" and "Differentiable Neural Computers".
4) A programmer specifies a scaffold for the program (the basic logic or flow of data), but leaves many of the program's parameters flexible. This code can carry out many different computations (like how a neural network can mimic any function), depending on the parameter values.
5) Differentiable programs are optimized using gradient descent: pick a cost function, compute the direction of steepest descent (the gradient), and move the parameters in that direction. Repeat until convergence. Tools like TensorFlow largely automate this process for the user.
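The training loop above can be sketched in a few lines of plain Python (a toy example, not PennyLane/TFQ code; the cost function and learning rate are made up for illustration):

```python
# Toy gradient descent: minimize C(x) = (x - 3)^2.
# The gradient dC/dx = 2*(x - 3) is supplied by hand here;
# tools like TensorFlow compute it for you automatically.

def cost(x):
    return (x - 3.0) ** 2

def grad(x):
    return 2.0 * (x - 3.0)

x = 0.0    # initial parameter value
lr = 0.1   # learning rate (step size)
for _ in range(100):
    x -= lr * grad(x)   # step in the direction of steepest descent

print(round(x, 4))  # converges toward the minimum at x = 3
```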
6) Amazingly, all of these ideas can be ported to quantum computers! The Born rule, which specifies the probabilities of different measurement results, is differentiable. Similarly, expectation values (averages) of measurements depend smoothly on the parameters of the circuit.
7) So how do we compute the gradient of a quantum circuit with respect to its parameters? Aren't quantum computations notoriously intractable for classical computers?
The trick here is to *use the quantum circuit itself* to compute the gradient.
8) We don't just mean approximating the gradient using the finite-difference method (though that can also work). It turns out that we can compute the *analytic gradients* of a quantum circuit to within the "machine precision" of our quantum device.
9) The method for doing this, called the *parameter-shift* rule, is one of the fundamental ingredients of both PennyLane and TensorFlow Quantum.
This rule is easy to describe, but magical at first sight. In fact, there's no deep magic—unless you consider trigonometry magic 🧙
10) Say we have a quantum circuit, with a gate G, which has the continuous parameter u. We make some final measurement on this circuit. The average of this measurement is our cost function C(u). We can evaluate this cost by programming this circuit into a quantum computer.
11) How do we compute the gradient dC/du?
The answer is a simple recipe:
-Shift u forward by an extra pi/2 and evaluate that circuit
-Shift u backwards by pi/2 and evaluate
-Take the linear combination [C(u+pi/2) - C(u-pi/2)]/2
The resulting value is exactly equal to dC/du!
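The recipe above can be checked on the smallest possible example (a hand-rolled single-qubit simulation, not PennyLane/TFQ code): one RX(u) gate acting on |0>, with C(u) the expectation value of Pauli-Z, which works out to cos(u).

```python
import math

def rx_state(u):
    # RX(u)|0> = [cos(u/2), -i*sin(u/2)]
    return (math.cos(u / 2), -1j * math.sin(u / 2))

def cost(u):
    a, b = rx_state(u)
    # <Z> = |amplitude of |0>|^2 - |amplitude of |1>|^2 = cos(u)
    return abs(a) ** 2 - abs(b) ** 2

def parameter_shift(u):
    # Two circuit evaluations, shifted forward and backward by pi/2
    return (cost(u + math.pi / 2) - cost(u - math.pi / 2)) / 2

u = 0.7
print(parameter_shift(u))  # equals the analytic derivative -sin(u)
```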
12) Why does the parameter-shift rule work?
Here's a simple analogy:
The derivative of sin(u) is cos(u) = sin(u+pi/2).
If we have a device that evaluates the "sin" function, we can find its derivative at any point, using that same device, by simply shifting the argument.
13) Things are not quite so simple in the quantum case, though the same intuition holds.
The parameter-shift rule is basically the identity cos(u) = [sin(u+pi/2) - sin(u-pi/2)]/2.
So we need to evaluate the device twice to compute the derivative with respect to u.
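The identity is easy to spot-check numerically:

```python
import math

# Numeric spot-check of the identity behind the parameter-shift rule:
# cos(u) = [sin(u + pi/2) - sin(u - pi/2)] / 2
u = 1.23
lhs = math.cos(u)
rhs = (math.sin(u + math.pi / 2) - math.sin(u - math.pi / 2)) / 2
print(abs(lhs - rhs))  # ~0, up to floating-point error
```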
14) To evaluate the gradient vector, we need to perform the parameter-shift recipe for every free gate parameter.
Fortunately, this step is embarrassingly parallelizable. If we had access to many QPUs, we could compute all the gradient elements at the same time. 😁
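Per-parameter shifts look like this in a sketch (a hypothetical toy cost standing in for a real circuit evaluation: one RX(u_i) per qubit, measuring Z on every qubit, gives C(u) = prod_i cos(u_i)):

```python
import math

def cost(params):
    # Toy stand-in for evaluating a parametrized circuit on a QPU
    prod = 1.0
    for u in params:
        prod *= math.cos(u)
    return prod

def gradient(params):
    # One parameter-shift recipe per free gate parameter.
    # Each +/- pi/2 pair of evaluations is independent of the others,
    # so with many QPUs they could all run in parallel.
    grads = []
    for i in range(len(params)):
        plus = list(params)
        minus = list(params)
        plus[i] += math.pi / 2
        minus[i] -= math.pi / 2
        grads.append((cost(plus) - cost(minus)) / 2)
    return grads

print(gradient([0.3, 1.1, -0.4]))
```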
15) Of course, working with real quantum hardware, we can only estimate measurement expectation values using sample averages. In practice, we will have some small errors (depending on the number of shots). But in the limit of infinite shots, we converge to the exact gradient.
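Shot noise is easy to see in a sketch (again a toy single-qubit example, not real hardware code): a Z measurement after RX(u)|0> returns +1 with probability cos(u/2)^2 and -1 otherwise, so the exact expectation is cos(u).

```python
import math
import random

random.seed(42)  # for reproducibility

def sample_expectation(u, shots):
    # Simulate finite-shot estimation of <Z> after RX(u)|0>
    p_plus = math.cos(u / 2) ** 2
    total = 0
    for _ in range(shots):
        total += 1 if random.random() < p_plus else -1
    return total / shots

u = 0.9
exact = math.cos(u)
for shots in (100, 10_000, 100_000):
    est = sample_expectation(u, shots)
    print(shots, round(abs(est - exact), 4))  # error shrinks with shots
```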
16) We don't have a parameter-shift recipe for every gate. But the recipe holds widely enough to cover most circuits in practice. This lets PennyLane or TFQ perform "automatic differentiation" even on quantum hardware.
17) Finally, since measurement values are just classical data, quantum circuits can easily be used as steps in a larger computation, which might also contain classical processing.
18) In the parlance of ML software like TensorFlow/PyTorch, the quantum circuit is a node in a larger *computational graph*. Only within this "quantum node" is the information quantum mechanical.
19) How can we wire up quantum computers to ML software like TensorFlow or PyTorch?
Fortunately, the devs of these libraries were forward-thinking (though maybe they didn't originally anticipate connecting to quantum computers!). They provide a way to define custom functions.
20) In PyTorch, we need to subclass `torch.autograd.Function` and provide two methods: `forward()` and `backward()`.
With these two pieces, PyTorch now knows how to train quantum circuits or hybrid quantum-classical models, just like it would train a neural network.
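If you don't have PyTorch handy, the forward/backward contract can be mimicked in plain Python (a hypothetical sketch of the idea, not the real `torch.autograd.Function` API; the cost C(u) = cos(u) stands in for a circuit evaluation):

```python
import math

class QuantumNode:
    """A 'quantum node': cost C(u) = cos(u), as if evaluated on a QPU."""

    def forward(self, u):
        self.u = u           # save the input for the backward pass
        return math.cos(u)   # circuit evaluation (here: simulated)

    def backward(self, grad_output):
        # Parameter-shift rule: dC/du = [C(u+pi/2) - C(u-pi/2)] / 2,
        # chained with the incoming gradient, as autograd would do.
        shift = (math.cos(self.u + math.pi / 2)
                 - math.cos(self.u - math.pi / 2)) / 2
        return grad_output * shift

node = QuantumNode()
out = node.forward(0.5)
grad = node.backward(1.0)   # equals -sin(0.5)
print(round(out, 4), round(grad, 4))
```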
21) There you have it.
In a nutshell, those ideas are the basis of modern QML libraries like PennyLane or TensorFlow Quantum. They're also the basis for the "Qiskit & PyTorch integration" you may have seen from a @qiskit hackathon project last year.
22) When we first created PennyLane in 2018, our thinking revolved around these core ideas. It's great to see the ideas starting to take hold more widely; these are really leading-edge concepts.
23) Fun fact: In order to best capture these fundamental ideas in a succinct way, the original unofficial tagline we used for PennyLane was:
"The TensorFlow of Quantum Computing"
24) This was just a quick overview of some key ideas behind QML. We've been building a lot of educational content around QML at PennyLane.
If you enjoy it, consider following us or starring our GitHub repos.
Stay tuned!