
SpriteDX - Pixel Alignment - Lab Note 3


So far, we have an anti-corruption model that can turn blurry pseudo pixel art (left) into real pixel art (middle, right).

It is heading in the right direction, but it still has several problems.

Issue #1: Dataset Sourcing

The first issue is that the dataset we used is a mashup of random pixel art that I grabbed from the internet. That may be okay for experimentation, but for a publicly available model I can’t simply scrape pixel art from the web.

One potential solution is to use generated pixel art: we create pixel art using MidJourney, BFL, or other sources. This will again produce upscaled pseudo pixel art with imperfect pixel alignment. However, in previous research we found that we can fine-tune these pre-trained models to produce much more consistently scaled pixel art with far fewer artifacts.

That should solve the data sourcing problem.

Issue #2: White regions being considered transparent

As you can see in the earlier image, some of the white regions are being classified as transparent. For Day 3 of this model’s inception, it is doing the job pretty well, but that imperfection is quite ugly and not production ready.

There are a few proposed solutions.

  1. [Done] Focal loss on Whites — we weight the loss term so that white regions incur much higher losses. This was implemented yesterday and was highly effective, though not enough to solve the cases above.

  2. [Pending] Magenta BG — So far we’ve been using pure white as the BG color, and as you might expect, with a pure-white BG the model often categorizes white regions that should be opaque as transparent.

    One possibility is to use magenta as the BG color to make such collisions less likely. Having looked through lots of game sprites, I find that magenta is almost never used in the sprites themselves, so we could require the model to accept only magenta backgrounds.

    This does mean that when generating sprites from AI models, we need to ask them to generate on a magenta background. That should be possible with our fill-in-the-blank style generator. This may be a solid option; I think we should try it.

    One caveat is that if the model only supports a magenta BG, people looking for general use with other BG colors won’t be able to use it. So my thinking is that we would support other BG colors but make magenta the primary one, so that when sprites with a magenta BG are provided, it would perform flawlessly.

  3. [Done] Data Augmentations — We added some data augmentations for better generalization: random crops and random HSL shifts. We still need to tune them so they make sense, but in general this seems to be helping.

    [Pending] There are a few other data augmentations I should try. I will try augmenting the BG color so that the model can’t just shortcut to “if it is pure white, make it transparent.”

  4. [Done] Adding More Whites to Dataset — One other thing we did is add data that contains white opaque pixels. This lets the model learn a much more diverse set of shapes.

    The dataset had lots of small sprites, but it was lacking in large white areas. So I curated a list of sprites with large white areas and added them. This was successful to a good degree.

    [Pending] We can do more here to add more meaningful whites, but I’d say we should look into other, higher-ROI approaches first.

  5. [Pending] Focal Loss on Contours — The most noticeable errors occur where opaque pixels meet transparent pixels (i.e., the contours of sprites). Weighting these contours more heavily should help the model focus on the high-value pixel areas.

  6. [Pending] Including Generated Samples — The most significant problem, in my opinion, is that the model has not yet seen the various kinds of generated pixel art. That is, the data distribution of our dataset probably only partially covers the data distribution of generated pixel art.

    Growing the dataset would be an option, but it will become prohibitively expensive as we progress. So we need a more cost-effective way, and one option is to include generated samples in the dataset.

    Because of how our model training is set up, we need pixel-perfect sprites. The reason we haven’t been able to include generated samples is that it would require us to re-pixel the pseudo pixel art into real pixel art, which takes an unimaginable amount of time (an hour per sprite), at least with the tools we have at hand.

    To fight this issue, we will need to build an AI pipeline that can generate perfect pixel art.

    If such a pipeline existed, why would we be doing any of this work to fix pseudo pixel art? SpriteDX spits out pseudo pixel art animation frames; to convert them to real sprites, we need a model that turns pseudo pixel art frames into real pixel art frames. We have experimented with different approaches, but our current solution can’t do pixel alignment very well, and the cost is rather high (30 cents per character). I want this down to ~0 cents.

  7. [Pending] GAN to learn pixel corruption — Right now we hand-replicate the artifacts of pseudo pixel art using a series of transforms. This is all hand-tuned and sometimes yields a data distribution that omits certain artifacts. For example, if a pseudo pixel art frame has subtle motion blur, the model will never have seen that artifact. Also, the pixel misalignment in generated pixel art isn’t as regular as upscale-translate-downscale; it is much more subtle and can’t be modeled perfectly by hand.

    To fight this issue, I plan to train a model that learns the mapping from perfect pixel art to pseudo pixel art. To train it, we will divide the sprite dataset into two buckets, A and B. Data in B is run through the corruption model (i.e., the generator), while data in A is fed in as is. The discriminator then tells whether a given sample is real pseudo pixel art or fake (generated) pseudo pixel art.
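The loss-weighting ideas in items 1 and 5 compose naturally into a single per-pixel weight map. Here is a minimal numpy sketch, not our actual training code: `weighted_alpha_bce` is a hypothetical helper, and `white_w` / `contour_w` are illustrative values, not tuned.

```python
import numpy as np

def weighted_alpha_bce(pred, target, rgb, white_w=4.0, contour_w=3.0, eps=1e-7):
    """Per-pixel BCE on a predicted alpha channel, up-weighting white
    regions and contour pixels (opaque/transparent boundaries).

    pred, target: (H, W) arrays in [0, 1]; rgb: (H, W, 3) in [0, 1].
    """
    pred = np.clip(pred, eps, 1 - eps)
    bce = -(target * np.log(pred) + (1 - target) * np.log(1 - pred))

    # Up-weight near-white pixels so the model can't cheaply call them transparent.
    white = (rgb.min(axis=-1) > 0.9).astype(float)

    # Contours: pixels whose 4-neighbourhood crosses the opaque/transparent boundary.
    pad = np.pad(target, 1, mode="edge")
    neighbours = np.stack([pad[:-2, 1:-1], pad[2:, 1:-1],
                           pad[1:-1, :-2], pad[1:-1, 2:]])
    contour = (np.abs(neighbours - target).max(axis=0) > 0.5).astype(float)

    weights = 1.0 + white_w * white + contour_w * contour
    return (weights * bce).sum() / weights.sum()
```

In a real training loop this would be the tensor-library equivalent; the point is only that both weightings fold into one weight map normalized by its own sum.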
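The pending BG-color augmentation from item 3 amounts to compositing each sprite over a random solid color before corruption, so "pure white" stops being a usable shortcut. A minimal sketch, assuming float images in [0, 1] and a binary alpha mask; `randomize_background` is a hypothetical helper, not existing code:

```python
import numpy as np

def randomize_background(rgb, alpha, rng=None):
    """Composite a sprite over a random solid background color.

    rgb: (H, W, 3) floats in [0, 1]; alpha: (H, W) mask, 1 = opaque.
    Returns the composited image and the background color used; the
    target alpha for training stays unchanged.
    """
    if rng is None:
        rng = np.random.default_rng()
    bg = rng.uniform(0.0, 1.0, size=3)  # new solid color per sample
    a = alpha[..., None]
    return a * rgb + (1 - a) * bg, bg
```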
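For reference, the upscale-translate-downscale corruption mentioned in item 7 can be approximated like this. It is a sketch only: `scale` and `shift` are illustrative, and the real pipeline chains more hand-tuned transforms than this (which is exactly the gap the GAN is meant to close).

```python
import numpy as np

def corrupt(sprite, scale=4, shift=(1, 2)):
    """Approximate pseudo-pixel-art artifacts: nearest-neighbour upscale,
    integer misalignment, then box-filter downscale back to the original grid.

    sprite: (H, W, C) float array.
    """
    h, w = sprite.shape[:2]
    big = sprite.repeat(scale, axis=0).repeat(scale, axis=1)  # upscale
    big = np.roll(big, shift, axis=(0, 1))                    # misalign
    # Downscale by averaging each scale x scale block.
    return big.reshape(h, scale, w, scale, -1).mean(axis=(1, 3))
```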

There are more issues, but I think the above provides enough to keep me running for the next week. Let me focus on a subset of these action items.

My heart tells me I should work on the GAN setup. My brain tells me I should focus on generating a pixel-perfect dataset first.

At least I know I can focus on one of these next week.

— Sprited Dev