Mechanics: CSP – Part II – Universal Solver, Good For Nothing

Ok, ok, let’s talk more about CSPs and how you can build content by solving CSPs. This is the second part of this series, and it’s highly suggested you read the previous entry. This time we’ll talk about methods of solving constraint problems in theory and practice. It will be pretty handwavy, but there are things I want to establish before moving on.

So let’s start by going over some algorithms. Specifically, we’ll touch on a rather interesting topic first, a general solution to CSPs exists, and you shouldn’t care about it. That’s generally not what you expect to hear, but we’ll try to work through the details and present an actual, practically workable solution.

1) The General CSP Solver

We’ll be dealing with the Backtracking With Forward Checking algorithm I mentioned in the previous article.

So, let’s start with some notation, let:

S – our space, the Cartesian, but in this case, we’ll treat it as a set of variables we must assign to solve the problem;
s – a particular valuation of S, s ∈ S;
xₐ – the a-th variable in s, the initial value of every variable is the symbolic value null;
Dₐ – the domain of the a-th variable, or the discreet set of all values that can be assigned to xₐ;
Kₐ – the set of variables that appear in the a-th constraint
D’ₐ(s, C) – the domain of the a-th variable, but limited to those values in Dₐ that do not contradict any constraints; if we don’t know how to do this, we can assume D’ₐ(s, C) = Dₐ and proceed without the “forward checking” part of the algorithm;

Let’s stop here for the first time. The process of deriving D’ₐ(s, C) is not trivial. I’m assuming here that we can do it, and as a general rule, we can, but it’s not always easy, nor is it always particularly efficient to do. It’s trivial in the case of Wave Function Collapse but doesn’t need to be simple when we can arbitrarily construct constraints. This happens especially when dealing with the interplay between multiple variables and conditions.

Now for the pseudocode:

SolveCSP(S, C):

n := |S|
s := {null}ⁿ
return SolveNextVariable(s, C)

This will be our initial function. It begins by setting all of our variables to null, and then it proceeds to solve the next variable. Note that optionally, we can lock in some variables, making the process more guided. This can also be represented by changing the content of set C, though.

SolveNextVariable(s, C):

for each cₘ ∈ C: if ¬cₘ(s) ∧ ∀(xₐ ∈ Kₘ) xₐ ≠ null: return null
if ∀(xₐ ∈ s) xₐ ≠ null: return s
if ∃(xₐ ∈ s) |Dₐ(s, C)| = 0: return null

k = argmin(a){|Dₐ(s, C)| + uniform(0, 1⁻)}
Vₖ := shuffle {v : v ∈ Dₖ(s, C)}

for each v ∈ Vₖ:

xₖ := v
r := SolveNextVariable(s, C)
if r ≠ null: return r

return null

Let’s go over this line by line. We start by checking every constraint. If we know the values that define this constraint and the constraint is not satisfied, we have a contradiction, and we perform what we call a “backtrack”, hence the name of this method. If we ever backtrack out of the root function call, we have an unresolvable contradiction (i.e. the CSP cannot be solved).

Next, let’s see how many variables remain unassigned. If all variables are indeed assigned, and there was no contradiction (previous line), we have found the solution and return it.

At this point, we can also officially check if we have a variable that cannot be assigned anymore. This is an early warning for a contradiction. It means that while no constraints are broken at the moment, assigning anything to that variable will do so.

Now let’s assign to k the index of the variable, for which the domain size modified by a random number in the range of <0, 1) is the smallest. Speaking otherwise, let us select such a k, that the variable is randomly plucked from among variables with the smallest domain cardinality.

Next, we make a randomly shuffled ordered list of all values that can be assigned to the k-th variable.

We’ll be iterating over these, trying to find a solution to our problem.

To do this, we start by setting the current value of xₖ to v and attempting to recursively call SolveNextVariable(s, C) if this succeeds (we don’t get a contradiction and return null), we’ve found a solution.

If we went over all possible values in Vₖ and didn’t find a solution, we backtrack.

There, that’s it; that’s a general solution to CSP. So what’s the issue with it?

2) Yes, But…

You see, there’s this fucker:

D’ₐ(s, C)

(Exhibit A: The fucker in question.)

I said what this is but never really explained how to achieve this algorithmically. And the reason for that is… well… it’s incredibly hard, possibly impossible, to give a fully general explanation of how to calculate this function. It’s very deeply dependent on your implementation of constraints and how you build your underlying data structures. At some point, even if you can always correctly calculate D’ₐ(s, C), you might find yourself suffering slowdowns in your algorithm. Yes, by trying to use Forward Checking to speed up solving the CSP, if you go overboard with it, you might find that the complexity might even exceed the complexity of brute force solving CSP (that’s the situation where we assume that D’ₐ(s, C) = Dₐ in all cases). And that’s no good.

For the previous case of Wave Function Collapse, the process of keeping your domains updated is quite simple. If you place a tile, that at most limits four adjacent domains. Same if you backtrack, you can just update four domains, and you don’t need to keep memory of your solution stack.

But the actual shot to the knee here is that often Forward Checking just isn’t needed. But we’ll discuss it in a moment. There’s one more issue with this whole generalised process, and it’s this line:

for each cₘ ∈ C:
if ¬cₘ(s) ∧ ∀(xₐ ∈ Kₘ) xₐ ≠ null:
return null

(Exhibit B: The other fucker.)

To be just frank, just defining all your constraints as a list sucks. Sure, as long as you keep your conditions uniform and straightforward, there are many ways of optimising this, but that’s the thing – the solution isn’t general in that case. So for a general CSP solver, we simply have a set of constraints because that’s the only genuinely general way of presenting it.

And just checking constraints can be highly time-consuming. We can apply overarching, complicated to calculate global conditions to your problem and find that our CSP can’t even verify them until it fills out a large number of variables, but at that point, it has to start backtracking to avoid them. Global conditions are also tough to work into Forward Checking.

For a simple example, think of generating a dungeon. How do you express the constraint of “the player can access every tile containing a walkable surface”? Don’t get me wrong, it’s possible to do through propagating accessibility from the point of entry, but it’s costly, and as long as you have disconnected tiles that could still be connected in the future, the construction remains up in the air.

3) So what?

Understanding CSP in the general sense is a valuable tool to help you understand problems you encounter when designing content generation systems based on it. However, a pure, generic approach to CSP is naive and doesn’t offer an effective, fast solution.

In most cases, when randomly generated content is used in a game, it has to be generated at run time, in many cases in a short time, during a loading screen.

Sure, there are cases where procedural generation (mostly) happens once, when starting a gameplay session, or can be done before the game is even shipped. But, unfortunately, those are edge cases. And even then, it’s not like a more efficient approach wouldn’t be good even in that situation.

In reality, you actually need to write your own specialised CSP solver for your own specialised subclass of CSPs. Hence what I said just at the start of this section. The general solver is a template from which you build your own specialised solution. You need to understand how it works, but you won’t be implementing it.

Further parts of this series will go on to actually implement an example of this, but for the moment, to finish this part off, have a few general observations.

A CSP solver can be built to ensure some constraints are satisfied without ever checking them.

Let’s go back to what I said about the dungeon building. You don’t need to make “all rooms must be connected” a constraint if you build your solution so that disconnected rooms cannot be constructed.

Going through the variables in order of domain size isn’t always the best way.

If you have global variables that state “it must be that X” (as opposed to “it cannot be that Y”), it might be possible to order your variables in such a way that you prioritise solving the constraint. For our dungeon example, if our goal is to make sure five Magical Crystals of Thyme are placed somewhere in the dungeon, it might be best to ensure they are placed within the first 5 recursions of the algorithm.

Divide and conquer your generation.

While it might be tempting, at first, to sculpt some kind of big, monolithic generator for your content, basically every successful generator I know divides the process into smaller problems. Again, I’d like to emphasise this, all of the good generation algorithms I’ve ever seen, and I’ve seen many, divide the problem and iteratively generate more detail.

Let’s get back to the previous example with the Thyme Crystals. You can divide your dungeon generation into two stages. First, generate the dungeon topology (room layout), then generate dungeon features (the details contained within the rooms). The magic salt rocks can be distributed during the second phase.

But you might notice that the second and third point presented above clash. This is because they offer different solutions to the same problem. Which is very much the case, yes. When you design your own content generator, it’s up to you to decide which would make more sense. Additionally, there is no need to strictly adhere to the CSP approach when generating. Indeed, it’s possible to mix and match procedures when solving your subproblems and even use CSP to augment other common generation algorithms (like cellular automata).

Next, we’ll actually design our own generator. Or at least start.

Leave a comment