Swerve drives are fun. While sometimes impractical, they check off more boxes than any other drivetrain. Not only can you drive them in any direction, but you can also rotate independently, all while using conventional wheels aimed in the optimal direction for force. Unlike mecanum wheels, which have found use in forklifts, swerve is almost exclusively used by FIRST robotics groups.

Mathematical Model

A swerve drive takes two inputs for control: the desired translation and rotation. These map to the kinematic quantities of a velocity vector and an angular velocity, which I’ll call $$\vec{v}$$ (m/s) and $$\omega$$ (rad/s). The outputs are actually motor values for 2x the number of modules (for pivot and drive motors), but for now, let’s abstract this away and pretend every module takes a vector.

Here is where this definition will diverge from many other implementations. We will solve this in the general case, so $$n$$ modules at arbitrary locations. Most implementations have 4 modules in a rectangle around a center, but the general case is both closer to the underlying physics and (at least to me) easier to implement well in modern languages.

Now, since we have angular velocity we need a reference point. If we assume an origin exists at $$(0,0)$$ we can define the location of each module relative to that. Therefore module $$n$$ is located at $$\vec{m}_n$$ relative to the origin (in meters).

With this we can define a simple frame like this with just four vectors around a center point:

This allows for what is essentially the fundamental formula of swerve drive:

$\vec{\text{output}}_n = \vec{v} + \omega\cdot\text{perpendicular}(\vec{m}_n)$

This doesn’t specify whether you need the clockwise or counter-clockwise perpendicular function, but as long as it agrees with $$\omega$$ it doesn’t matter.
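As a minimal sketch of the formula above, assuming the counter-clockwise perpendicular (so positive $$\omega$$ means counter-clockwise rotation): the names `perpendicular_ccw` and `module_outputs` and the square frame are made up for illustration and are not taken from the notebook.

```python
from typing import List, Tuple

Vec = Tuple[float, float]

def perpendicular_ccw(m: Vec) -> Vec:
    """Rotate a vector 90 degrees counter-clockwise; the length is preserved."""
    x, y = m
    return (-y, x)

def module_outputs(v: Vec, omega: float, modules: List[Vec]) -> List[Vec]:
    """Apply output_n = v + omega * perpendicular(m_n) to every module."""
    outputs = []
    for m in modules:
        px, py = perpendicular_ccw(m)
        outputs.append((v[0] + omega * px, v[1] + omega * py))
    return outputs

# Hypothetical square frame: four modules 0.5 m from the origin on each axis.
square = [(0.5, 0.5), (-0.5, 0.5), (-0.5, -0.5), (0.5, -0.5)]

pure_translation = module_outputs((1.0, 0.0), 0.0, square)  # every module gets v
pure_rotation = module_outputs((0.0, 0.0), 2.0, square)     # tangents around the origin
```

With no rotation every module simply receives $$\vec{v}$$, and with no translation every module receives a vector tangent to a circle around the origin.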

By using this general form of swerve drive we get support for much more powerful operations simply by changing a few variables. For example, just by changing the module locations in $$\vec{m}$$ you get support for arbitrary shapes:

Also, since the perpendicular function maintains length we get rotational scaling for free:

The final benefit of this model is that the center of rotation is arbitrary. Even if most of the time a center-based rotation is desired, the center can be moved anywhere on the 2d plane if needed.
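One way to move the center of rotation, sketched under the assumption that module positions are stored relative to the frame center, is to subtract the desired center from every $$\vec{m}_n$$ before applying the formula. The helper names here are hypothetical.

```python
def recenter(modules, center):
    """Express module positions relative to a chosen center of rotation."""
    cx, cy = center
    return [(x - cx, y - cy) for x, y in modules]

def rotation_component(omega, m):
    """omega * perpendicular(m), using the counter-clockwise perpendicular."""
    return (-omega * m[1], omega * m[0])

# Hypothetical square frame, rotating about its front-left corner module.
square = [(0.5, 0.5), (-0.5, 0.5), (-0.5, -0.5), (0.5, -0.5)]
about_corner = recenter(square, (0.5, 0.5))
spins = [rotation_component(1.0, m) for m in about_corner]
```

The module sitting on the new center gets a zero rotation vector, while the others sweep around it.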

Below the high-level kinematics model, a basic implementation can just be a cartesian to polar conversion. However, there are a few parts that stray from the theoretical model that allow for better control using physical motors.
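The cartesian to polar conversion could look something like this sketch, which turns one module's output vector into a drive speed and a pivot angle. The function name and the convention of measuring the angle counter-clockwise from the +x axis are assumptions.

```python
import math

def to_polar(vec):
    """Convert a module output vector into (speed, angle in radians)."""
    x, y = vec
    speed = math.hypot(x, y)   # length of the vector, used for the drive motor
    angle = math.atan2(y, x)   # heading in (-pi, pi], used for the pivot motor
    return speed, angle

speed, angle = to_polar((0.0, 2.0))
```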

Speed Normalization

To get the best performance out of the drivetrain you should be running it at a speed where it occasionally maxes out the motors. For example, the module output vector lengths at some point could be [2.30584285 1.80334026 0.55053495 1.538819]. Since a motor should never be set to over full power, there needs to be defined behavior for handling this case.

Option 1: Clamping

The naive method is clamping and is most likely what will happen by default if speed normalization is ignored. Here this results in values of [1.0 1.0 0.55053495 1.0]. This distorts the final direction, so this should never be used.
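A quick sketch of clamping on the example values shows where the distortion comes from: the ratio between the first two modules changes, so the modules no longer agree on a single motion. The variable names are illustrative.

```python
speeds = [2.30584285, 1.80334026, 0.55053495, 1.538819]

# Clamp each speed independently to full power.
clamped = [min(s, 1.0) for s in speeds]

ratio_before = speeds[0] / speeds[1]    # roughly 1.28
ratio_after = clamped[0] / clamped[1]   # exactly 1.0, the relationship is lost
```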

Option 2: Pre-normalize

This method relies on finding the highest possible speed and scaling down each value accordingly if it is greater than 1. The highest possible speed always occurs when the translation and rotation components point in the same direction, so we can just add their magnitudes.

$\text{scalar}=\min\left(\frac{1}{\left|\vec{r}_\text{max}\right| + \left|\vec{v}\right|}, 1.0\right)$

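This relies on knowing $$\left|\vec{r}_\text{max}\right|$$, the largest rotation component $$\omega\cdot\text{perpendicular}(\vec{m}_n)$$ of any module, which always belongs to the module furthest from the center of rotation.

$\left|\vec{r}_\text{max}\right|=|\omega|\,\left|\vec{m}_\text{furthest}\right|$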

This approach is the simplest and is guaranteed to never exceed 1, but it may be slightly too conservative. For example, this results in values of [0.99638291 0.77924539 0.23789289 0.66494252], where the largest value falls just short of 1. While only about $$0.4\%$$ is lost in this example, the loss may be more pronounced in others.
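Here is a rough sketch of pre-normalization. The frame, the inputs, and the function name are hypothetical, and the counter-clockwise perpendicular is assumed.

```python
import math

def pre_normalize_scalar(v, omega, modules):
    """Scalar from min(1 / (|r_max| + |v|), 1), with |r_max| = |omega| * |m_furthest|."""
    furthest = max(math.hypot(x, y) for x, y in modules)
    bound = abs(omega) * furthest + math.hypot(v[0], v[1])
    # bound <= 1 means no module can exceed full power, so leave the inputs alone.
    return 1.0 if bound <= 1.0 else 1.0 / bound

# Hypothetical square frame and inputs, chosen only for illustration.
square = [(0.5, 0.5), (-0.5, 0.5), (-0.5, -0.5), (0.5, -0.5)]
v, omega = (1.0, 0.0), 1.0
k = pre_normalize_scalar(v, omega, square)

# Scale both inputs by k, then apply output_n = v + omega * perpendicular(m_n).
speeds = [
    math.hypot(k * v[0] - k * omega * y, k * v[1] + k * omega * x)
    for x, y in square
]
```

Because the scalar only depends on the inputs and the frame geometry, each module can apply it without knowing what the other modules are doing.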

Option 3: Post-normalize

This approach just takes all the speeds and divides them by the highest, if it’s larger than one. In the example, this produces values of [1. 0.78207422 0.23875649 0.6673564]. This method produces the best values but can be a little weird to implement if you separate each module into its own object since they would all need to communicate their values.
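Post-normalization is short enough to sketch directly on the example values. The function name is made up, and the speeds are assumed to be non-negative vector lengths.

```python
def post_normalize(speeds):
    """Divide every speed by the highest one, but only if it exceeds 1."""
    highest = max(speeds)
    if highest <= 1.0:
        return list(speeds)
    return [s / highest for s in speeds]

speeds = [2.30584285, 1.80334026, 0.55053495, 1.538819]
normalized = post_normalize(speeds)
```

Note that the function needs every module's speed at once, which is exactly the communication step that gets awkward when each module is its own object.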

Direction Flipping

Turning a module to perfectly match the target vector is often unnecessary since the opposite vector with a reversed speed accomplishes the same thing. Taking this into account, one module should never have to travel more than 90 degrees to reach its target.

This can be implemented with a simple if statement, but there is a better option using vectors that also fixes the stray module problem.
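The simple if statement version might look like the following sketch, assuming angles in radians and hypothetical function names.

```python
import math

def wrap_angle(a):
    """Wrap an angle into [-pi, pi)."""
    return (a + math.pi) % (2 * math.pi) - math.pi

def flip_if_shorter(current_angle, target_angle, target_speed):
    """Return (angle, speed), reversing the wheel if that needs less turning."""
    error = wrap_angle(target_angle - current_angle)
    if abs(error) > math.pi / 2:
        # The opposite heading is closer: aim there and drive backwards.
        return wrap_angle(target_angle + math.pi), -target_speed
    return wrap_angle(target_angle), target_speed

# A module at 0 degrees asked to face 170 degrees turns to -10 degrees instead.
angle, speed = flip_if_shorter(0.0, math.radians(170), 1.0)
```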

Stray Module Problem

A common problem is one module taking a different path from the rest. This results in the wheels fighting each other and stalling.

While this isn’t too bad for a single target, when the target is constantly changing this can become a large proportion of the total runtime.

Option 1: Mitigation

The simplest option is to slow down the modules when they are pointing in the wrong direction. A quick and easy way to do this is to multiply their speeds by the cosine of the angle difference, $$\theta$$, which is easy to do with vectors:

$\cos\theta=\frac{\vec{\text{target}}\cdot\vec{\text{current}}}{|\vec{\text{target}}|\ |\vec{\text{current}}|}$

If a more aggressive limit is needed this scalar can be raised to an exponent, which still keeps the result within $$[-1, 1]$$. Here is an example of speed scaled by $$\cos(\theta)$$ and $$\cos(\theta)^3$$:

Another benefit of cosine scaling is that it also takes care of reversing the drive wheel when needed, because the resulting cosine will be negative. The only drawback is that you can’t directly use even exponents, since they discard that sign, but that’s pretty much irrelevant.
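A sketch of cosine scaling with vectors, where `current` is assumed to be a vector along the wheel's present heading and the exponent is assumed to be an odd integer. The function names, and returning zero when either vector has no length, are my own choices for illustration.

```python
import math

def cos_between(target, current):
    """Cosine of the angle between two 2D vectors via the dot product."""
    dot = target[0] * current[0] + target[1] * current[1]
    norms = math.hypot(*target) * math.hypot(*current)
    if norms == 0.0:
        return 0.0  # assumption: do not drive if either vector has no direction
    return dot / norms

def scaled_speed(target, current, exponent=1):
    """Target speed scaled by cos(theta) ** exponent (odd integer exponents keep the sign)."""
    return math.hypot(*target) * cos_between(target, current) ** exponent

aligned = scaled_speed((1.0, 0.0), (1.0, 0.0))    # full speed
sideways = scaled_speed((1.0, 0.0), (0.0, 1.0))   # no drive while 90 degrees off
opposite = scaled_speed((1.0, 0.0), (-1.0, 0.0))  # negative: wheel runs in reverse
```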

Option 2: Explicit Avoidance

Another option is to explicitly avoid having modules take different paths. Essentially, “if one is going to fight the others have it follow the others”. This can only be accomplished with some pretty ugly hacks, but it can be done.

One way of doing this is to shift the module flip windows. When each module is deciding which way to go it looks for the shortest path to being in line with its target. This creates two 180 degree windows, and the one it currently falls in decides which direction it is going to use. If you were to take the average of module rotations (or really their derivatives) and shift this window in the opposite direction proportionally, this would effectively make each module follow the rest if it’s close enough. This introduces another constant for how large this shift is. It would need to be tuned to be just larger than the usual error in rotation so it can capture most of the stray modules.

Going Further

Swerve drive gets programmed with a simple physical model assuming perfect inputs. However, swerve algorithms are not a problem with an ideal solution that can be derived or even expressed with conventional mathematical models.

A perfect control system would take into account these three separate factors:

• Time series control
• Noisy I/O
• Full robot kinematics

Each of these makes the final algorithms more complex. The final two are soft-computing problems as well, so there may not necessarily be clear ways to improve on them. So is this as good as it gets? In practice, yes.

But there is another option to take it further: neural networks.

Swerve, at least in 2d, is really just a function that takes three numbers ($$\vec{v}_x$$, $$\vec{v}_y$$, and $$\omega$$) plus $$n$$ encoder inputs, and produces $$2n$$ motor outputs for the drive and pivot motor speeds. And $$n$$ is almost always 3 or 4. With four modules that is seven linear inputs and eight linear outputs, and almost all the operations are additions and multiplications. This is just about the perfect use case for a small neural network.

However, this doesn’t take into account time series data, since it’s still stateless. To fix this we can give the network some memory, specifically as an LSTM.

Training this certainly would not be easy, and testing it even less so, but this is the best option to move forward.

Source code

All of the graphics here were created from this Jupyter notebook (HTML).