mind:atlas-developer-log [CS Wiki]

Sprint 1: Prometheus

[May 2 - May 13]

Objectives: Flow analysis, generation complete

May 2

Investigated the possibility of using an evolutionary algorithm instead of a formal grammar for generating flow maps in the high-level Sketch Phase.

May 3

Expanded the evolutionary algorithm to include a full flow-map instead of simple points. For now, this simply includes the graph of the flow-map: nodes in the graph have a position, size, and a list of undirected connections with other nodes. Those connections could eventually have weights (distances), but cartesian distances can be computed on the fly between points, so that might not be necessary.
I listed out the remaining unknowns for this evolutionary representation:
- Exactly what properties of a flow-map need to be tracked and measured? Which are characteristic?
  - I need to run some quick experiments on a bunch of real and toy levels to see their flow-maps and list the properties of those flow-maps which describe essential characteristics about the levels.
- How to track these properties on-line, without the need for a global re-computation at each generation?
  - I need to store this information in the flow-map itself keeping some on-line iteratively updatable storage mechanism for each property.
- What intentional mutations will alter these properties?
  - I need to take each property, and hand-develop the methods for increasing/decreasing that property on offending aspects of the graph.

May 4 - 7

No progress

May 9

Experimented with the notion of a two-phase genetic algorithm for layout generation:
- The first phase was simply concerned with getting the high-level global distributions of positions and sizes of flow nodes correct (no connections).
- The second phase was concerned with connecting those nodes in accordance with the general node-node connection distributions from the source level.
Results: The problem was that in a flowMap, the connections were essential. Moving a node to the left or right might fundamentally alter the flow of the level (creating a cycle where none existed, or breaking one). In that sense, it was nearly impossible to generate just the nodes, and then just the connections with genetic algorithms. Very little of the important structure of the original level was preserved.
I believe this is where some global intentionality can help – I'm going to try to integrate the Act, Evaluate, Decide loop into the genetic algorithm and see if it can handle simple tasks like the one I tried above. Perhaps a deliberate repeated three-stage loop will help where a simple two-stage process failed. I'll upload screenshots to show off the results tomorrow.

May 10

Needed to do a bit of a refactor of my FlowMap structure to better enable it to iteratively update global properties.
Said refactor was getting a bit messy, I needed to go back and write some unit tests to ensure that connections between nodes were being tracked as I expected them to be.
Designed (but didn't run) a couple experiments I need to do to verify a few of my hypotheses:
- Given a long, thin winding flowmap in which all the nodes are randomly positioned, can a genetic algorithm efficiently reposition all of the nodes such that it forms a desired structure (two such examples of a desired structure are (1) a tightly-packed maze, and (2) a loosely-packed maze.
- Does the ability to add and remove nodes (not just reposition) speed up the convergence process?
Even though GAs have their own limitations and flaws, I'm becoming more and more convinced that they are well-suited to the first, high-level “sketch” pass of the level generation process. In fact, I think they're much better suited to the task than a formal grammar – I'm officially scrapping that as an option.

May 11

I ran the experiments that I designed yesterday. Convergence is the hard part, but with deliberate mutations and relaxable constraints, an evolutionary approach is a good one. I'm trying to remove as much 'random' as I can from the mutation process to speed it up.
- Test 001: Random Generation test with new refactor
- Test 002: Deliberate mutation test - only those nodes with violating properties were selected, and constraints (inter-node distance) could be relaxed.

May 12

Refactor – the ability to specify arbitrary:
- Properties of the flow map
- A mechanism for detecting nodes or connections that violate those properties
- A mechanism for attempting to correct those violations
The core problem here is giving the Genetic Algorithm the ability to aim toward multiple fitness functions at once (this may necessitate experiments with hierarchical explorations and backtracking). The more fitness functions you add, the harder it gets to converge (because future changes tend to clobber old progress, and because standard GAs cannot make deliberate short-term sacrifices in fitness in order to attain a higher future fitness – the best they can do is keep some lower-fitness members around each generation in the hopes of accidentally making that sacrifice/boost).
Test 004: Multiple properties being satisfied. It first tries to find a node that violates 'internal cohesion' – it's connections must be 3-5 tiles away. If there are none, it finds a node that violates 'external cohesion' – it's closest non-connected neighbor has a buffer space of exactly 1 tile between. If any violating node is found, the node is 'nudged' to a random available location. Notice how even though there are other options for nodes to tweak, it gets 'stuck' on one specific node. I'm going to switch from a “iterate through all nodes” approach to a “iterate randomly through all nodes” approach tomorrow to mitgiate this.

May 13

No progress

May 14

Test 005: Instead of randomly nudging an offending node to an available space, I experimented with nudging it to an available space roughly near the original position. The problem here is that nudging the node randomly is not a good targeted strategy for fixing violations of specific node properties (such as internal or external cohesion).
Instead, I ensured that specific properties come paired with specific methods of 'fixing' those properties if violated. For example, if Internal cohesion is too close, a specific method would be to nudge the node AWAY from its connected neighbor. On the other hand, if External cohesion is too close, we'd want to nudge it away from its disconnected neighbors.
~evil cackling~ WE HAVE CONVERGENCE on multiple fitness functions! The following experiments were run on different types of mutation methods for Internal and External cohesion for a given level to see if Atlas can maximize both at the same time without clobbering eachother.
- Test 006
- Test 007
- Test 008
- Test 009
Hypothesis 1: In order to add more fitness functions, it may be necessary to specify a priority order (instead of just a round-robin rotation each fitness). This way, Atlas will either (1) always attempt to improve heavier-weighted fitnesses before tweaking with the lower ones, or (2) a weighted round-robin where heavier-weighted fitnesses get extra 'turns'.
I thought of a new idea (kind of cheating) – start with the literal flow map of the ORIGINAL level and mutate that while keeping the general global properties intact. Eventually it should arrive at a unique level, but with the same properties, and it may overcome the random bootstrapping issue I've been seeing (connections have a hard time untangling themselves from random positions).

Sprint 2: Hyperion

[May 16 - May 28]

May 16

No Progress

May 17

Quick description, more to come later: I'm experimenting with a different formulation of a genetic algorithm – one which discards the notion of 'connections' in the flow map for a level. We already have a mechanism which can break a source level down into convex 'components' made from variously sized and positioned squares. These components can be perfectly described by their skeletons (the maximum inscribable squares within that shape). I have a hypothesis that if you took the bounding rectangle around a shape, and it is mostly filled-in, then that shape can be described as a cut-away from the whole rectangle (negative sculpting). Likewise, if the bounding rectangle is mostly empty space, (such as a twisting, winding tunnel), then the shape can be described by its positive components (positive sculpting). I think I can make a genetic algorithm which performs positive/negative sculpting to generate shapes. I'm going to try it out on two different real-world level types (1) Crypt of the Necrodancer, and (2) Rust Bucket. One is primarily a negative-sculpting method, and the other is positive-sculpting. My formulations of this method have all been on paper today, I'll try to code it up and see how it compares in practice tomorrow.

May 18

Started coding up the sculpting algorithm
- It's slower than the hyper-optimized “node nudging” algorithm that I'd created last week, and so the genetic algorithm no longer runs at 60GPS (generations per second).

May 19

To accomodate these changes, and visually evaluate the new sculpting process quickly, I started coding a tracker to log the best level flow of each generation, and be able to “replay” the evolution at 60GPS.

May 20

To retain my sanity and avoid burnout (and spend more time with family) I think I'm going to have to declare Fridays a “no thesis zone”
No progress

May 21

In preparation for the “space sculpting” my genetic algorithm version of the level flow sketch, I did a number of things:

Step 1: Component Layout (in which the algorithm lays out several points representing component cluster centers on the level. Remember that component clusters are the largest convex shapes found during the analysis phase, and tend to be things like rooms and large open spaces).
- Defined a component layout genotype and phenotype
- Defined several fitness functions to preserve the original stylistic layout during generation of a new layout:
  - Number of components (should be similar to original layout number)
  - Histogram of distances: component centers to level center (should be similar to original layout histogram)
  - Histogram of distances: component centers to component layout center (should be similar to original layout histogram)
  - Histogram of aggregate local properties of components:
    - Number of cartesian nearest-neighbors
    - Number of tile nearest-neighbors
    - Cartesian distance to nearest-neighbor(s)
    - Tile distance to nearest-neighbor(s)
    - Angles between cartesian nearest-neighbor(s)
    - Angles between tile nearest-neighbor(s)
Step 2: Shape of individual components (high-level sketch of the shape of the components, such as rooms)
- Defined a component shape genotype and phenotype
- Defined several fitness functions to preserve the original stylistic shape during generation of a new component shape:
  - (Largest size squares first) number of elements
  - Various histograms of the positions, angles, and distances of the elements within that shape (similar to the component layout metrics above)
  - Sets of histograms on successively smaller elements (including metrics on how they are positioned relative to larger ones).
  - This is where the 'sculpting' comes in. For convex component shapes, we're generating them using positive-space sculpting, starting by trying to generate a course-grained shape, and iteratively stepping up the level of granularity. In my paper-tests, this method was far more promising than taking a shape-flow and trying to mutate it to fit all the stylistic properties of a different shape-flow.

It's almost to the point where I can run tests, but there's still a few missing pieces of code. I'll be working to get it ready this week. If all goes well, the 'high-level sketch' phase of generation can be completed by the end of the week.

It should be noted that this method comes at the cost of generality. I now make the limiting assumption that a source level can essentially be broken up into (1) convex component shapes and (2) paths between those shapes. Though many real-world levels I've tried do have this property, I doubt this will be true for all possible arbitrary source levels. I'll need a way to formally describe what Atlas can and cannot use as a source level, and informally describe a few examples of each category.

May 23

Furiously coding up the above changes.

May 24

New Component Layout phenotype/genotype complete, fitness functions almost complete. It's just about ready for a spin.

May 25, 26

Working very late at Qualtrics, no thesis progress. Will work to finish this current method over the extra weekend time

May 28

TONS of scrapped progress – got to about 9:00pm after a ton of refactoring and experiments with genetic algorithms. Unfortunately, none of my parameters resulted in a satisfying (or even coherent) level sketch. I need to take a radically different approach. The metrics I previously defined for the level sketch are not nearly as good in practice as I had hoped. Using histograms as the basis of comparison on these metrics was also a mistake – for example, two maps that only differed by a single node, if that node was <A> far away from center, or <B> far away from center, both maps would be considered 'equal' if they differed from the desired histogram, because of the way I'm comparing the histogram bins.
I will perform a Root-Cause Analysis and pivot to rectify this mistake.

Sprint 3: Tethys

May 31

Performed the Root-Cause Analysis on my earlier experiment failure
Composited all current real-world source levels for one game for a more thorough analysis of 'style'. These rough composites comprise 10 full levels all conforming to a particular theme.
Informal study: watched an illustrator create a new level based on real source level – interesting note, they required multiple source levels to get a feel for the 'characteristic style'. A single level was *not sufficient* as a source. Perhaps it would be a good idea to widen the inputs to Atlas from 'a single source' to 'an arbitrary number of sources'. Atlas than could then focus on commonalities found in multiple levels (an aspect of layout analysis that is currently working).
Informal study: did some investigation of the 'remix' theory of creativity, which posits that 'everything is a remix' consisting of three activities: copying, transforming and combining. Perhaps this combine/transform/copy methodology would serve Atlas better than the decide-act-evaluate loop that I was trying to implement.

June 1

Source level component analysis begun.
Completed top-level hierarchical component analysis for the first set of 10 source levels for game 1. Interestingly enough, most high-level components of these levels are simple rectangles.
The astute viewer will notice that these levels contain both many similarities and many differences. For example, there are simple similarities (such as a fairly linear path), and more complex stylistic similarities (adding those little extra non-pathable single tile pieces hanging off of the main rooms, purely for decoration). This supports my new hypothesis that the 'characteristic style' doesn't make much sense for just one level (need at least 2, or a set), and that the 'characteristic style' of a set of levels is more of a collection of interesting similarities and dissimilarities than it is a particular range of formal metrics.

June 2

Composited 30 more source levels (top-level hierarchical component analysis almost complete).
Started drafting a generation approach based on 'copy, transform, combine' theory. I'm working on a general placement solution so that I can at least make the 'copy' part work this weekend.

June 3

Finished compositing Rust Bucket's hand-made puzzle levels.
It's interesting to note that all the levels of Rust Bucket generally conform to certain styles, such as a particular oblong aspect ratio of rectangular rooms and thin winding paths between rooms.
I want to compare at a minimum: (1) Rust Bucket's procedurally generated levels (should be even more cohesive in style, as all levels are generated using some hard-coded algorithm), and (2) Procedurally generated levels for a second game (Crypt of the NecroDancer).
My previous mistake was assuming that a single level always has a “characteristic style”. While that may be true to some degree, a single level is not quite enough to extract out a unique style. Instead, we must take as input a set of 2-3 levels and find the similarities/differences. Similarities are part of that characteristic style, and differences help inform the range of variation of a level in that style (which characteristics are less important to keep the same).

June 4

Defined a spectrum of generation between “Consistency” and “Originality”:
- 100% consistent generation produces levels identical to the source levels. Exclusively uses the “copy” technique.
- 100% original generation produces levels that are as different as possible from the source levels. Exclusively uses the “transform” technique.
- In the exact middle, Atlas should use a mixture of “copy”, “transform”, and “combine”.
- I would like for a variable along this spectrum of generation to be included as input.

June 6

Other commitments: no real progress

June 7

Finished the shape analysis on the next set of hand-made levels for Game 1.
Did a quick manual exercise: took two levels and tried to “combine” them by atlas-combo-experiment-1 hand on paper.

June 8

Started work on the generation algorithm using the “copy, transform, combine” creative methodology. The simple version – given a group of source levels, Atlas identifies major shape components (usually turns out to be rooms), and copies them into the generated level one at a time. Still working on this part (particularly the layout algorithm of where to put the components).
Fleshed out the algorithm for arbitrary convex shape component generation. For “copy”, there is no special generation needed (just reproduce the exact shape). However, for more advanced operations (like “transform” and “combine”), a methodology of creation is required.
- For Transformation, a simple “sculpting” method will do for now. Atlas can carve off or add pieces from the edges of a shape. Later “transform” operations might be a 'stretch' or 'extrusion' of a 2D shape.
- For Combination, we need to analyze the key similarities and differences between the two shapes being combined. A combined shape ought to retain much of the similarities found in the source shapes, and pick one of the differences to display. For example, (1) a rectangular room with a hole in the middle combined with (2) a square room with ragged edges. Both share a rectangular similarity in shape, but have key differences. Combination could result in a square room with a hole in the middle.

June 9

Sync-up with Dr. Ventura
Implemented the atomic “copy” operator (easy, it just copies a shape)
Started groundwork on a few atomic operators for “transform”, two additive transformations (extrusion, stretch) and one subtractive transformation (cut).

June 10

Friday: Thesis Break (no progress)

June 11

Code refactor: Getting ready for atomic operations (transform, combine). Needs a better code structure for shapes in order to do fast analysis.
Work towards a fast random shape generator using the new shape structure

Sprint 4: Pandora

[Milestone]: Second-Stage Level Generation Complete

Sprint 5: Mimas

[Milestone]: Third-Stage Level Generation Complete

Sprint 6: Calypso

[Milestone]: Full Paper Draft Complete (sans final validation results)

Sprint 7: Iapetus

[Milestone]: Full Paper Complete with Validation Results

Sprint 8: Janus

[Milestone]: Final Revisions Complete

Sprint 9: Atlas

[Milestone]: Schedule Thesis

Sprint 10: Titan

[Milestone]: Defend Thesis

mind/atlas-developer-log.txt · Last modified: 2016/06/13 22:09 by martindm

Back to top

Table of Contents