
[TARTA] NES Tetris No Rotation - 45 Lines (**NOT** RTA or TAS; read description)

TARTA (Tool-Assisted Real-Time Attack) is a hybrid of RTA and TAS. It can be seen as a run with near-perfect decisions that is still subject to the same physical limitations as a real-time run (reaction time, input speed, etc.).

TARTA is almost the same as a normal real-time run, except that the player has access to "tools" that help with decision-making. The tools can use only the information that the player also has (hence no RNG manipulation).

In this video, the tool is a program that reads the screen and outputs a favorable piece placement together with the input sequence that produces it. The program's output is shown in the bottom-right corner.

# Why was this run created?

There are 2 main purposes for creating this run:

1. Demonstrating a new strategy for the no-ro (no rotation) category, with the hope of pushing the WR further.
2. Introducing the concept of TARTA. I assume that TARTA can also be applied to other categories such as regular high-score runs, and that it could be more helpful to the RTA community than TAS.

# How did you write the program?

I trained an artificial neural network model (consisting of CNN residual blocks) using the proximal policy optimization (PPO) algorithm. The training took about 5 days on an RTX 2080 Ti GPU.
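For readers unfamiliar with PPO, here is a minimal sketch of the clipped surrogate objective that the algorithm is usually described with. It is not taken from the linked repository: the function name, the clip range of 0.2, and the sample numbers are all assumptions chosen for illustration, and a real training loop would also include value and entropy terms.

```python
import math

def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (to be maximized).

    clip_eps=0.2 is an assumed, commonly used value, not the author's setting.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probs.
        ratio = math.exp(lp_new - lp_old)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Keep the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)

# Hypothetical batch of three placement decisions.
objective = ppo_clip_objective(
    logp_new=[math.log(0.6), math.log(0.1), math.log(0.3)],
    logp_old=[math.log(0.4), math.log(0.2), math.log(0.3)],
    advantages=[1.0, -1.0, 0.5],
)
print(objective)
```

The clipping keeps a single update from moving the policy too far from the one that generated the games, which is the main idea behind PPO.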

The model averages 24.35 lines per game. It reaches 30 lines in about 26% of games, 35 in 12%, 40 in 4.5%, 45 in 1.5%, 50 in 0.4%, and 55 in 0.1%. In TARTA, some top-row placements that depend on the next piece become infeasible, but the overall performance is still good. I played only ~50 games before this run and got 36 lines in 3 of them.

Here's the model and the source code: https://github.com/adrien1018/noro-tetris-ai

# About the strategy and some model characteristics

It seems that the (near-)optimal strategy is to build a well on the left and tuck everything to the right (imagine rotating the whole field clockwise by 90 degrees), and to keep the right stack as accommodating as possible.

In no-ro, there is theoretically no left-right asymmetry apart from piece movement, on which I imposed no limits during training. However, I found that the model builds the well on the left almost every time. After some research, I realized that this is due to the non-uniformity of the RNG in NES Tetris. Assuming the internal RNG is statistically random, the piece spawning process can be modeled as a Markov chain with the following transition matrix (the row/column order is TJZOSLI):
[1 5 6 5 5 5 5
6 1 5 5 5 5 5
5 6 1 5 5 5 5
5 5 5 2 5 5 5
5 5 5 5 2 5 5
6 5 5 5 5 1 5
5 5 5 5 6 5 1] / 32
Notice that S-J sequences are more likely than Z-L sequences, which makes the left well preferable. There are other asymmetries in the transition matrix, but I assume that the S-J sequence is the most decisive one. However, the model scored only 0.2 lines less on average when I swapped the spawning behavior of S/Z and J/L, so the difference between a left well and a right well is not significant.
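The S-J versus Z-L comparison can be checked numerically. The sketch below assumes that each row of the matrix is the current piece and each column is the next piece (consistent with every row summing to 32), and it estimates the long-run piece frequencies by simply applying the chain many times. The function names and the step count are illustrative choices, not part of the original program.

```python
PIECES = "TJZOSLI"

# Transition counts out of 32, copied from the matrix above.
# Assumed convention: row = current piece, column = next piece.
COUNTS = [
    [1, 5, 6, 5, 5, 5, 5],
    [6, 1, 5, 5, 5, 5, 5],
    [5, 6, 1, 5, 5, 5, 5],
    [5, 5, 5, 2, 5, 5, 5],
    [5, 5, 5, 5, 2, 5, 5],
    [6, 5, 5, 5, 5, 1, 5],
    [5, 5, 5, 5, 6, 5, 1],
]
P = [[c / 32 for c in row] for row in COUNTS]

def long_run_frequencies(p, steps=200):
    """Approximate long-run piece frequencies by repeatedly applying the chain."""
    n = len(p)
    dist = [1.0 / n] * n  # start from a uniform guess
    for _ in range(steps):
        dist = [sum(dist[i] * p[i][j] for i in range(n)) for j in range(n)]
    return dist

FREQ = long_run_frequencies(P)

def pair_probability(first, second):
    """Probability that two consecutive spawns are (first, second)."""
    i, j = PIECES.index(first), PIECES.index(second)
    return FREQ[i] * P[i][j]

sj = pair_probability("S", "J")
zl = pair_probability("Z", "L")
print(f"S-J: {sj:.5f}  Z-L: {zl:.5f}  ratio: {sj / zl:.4f}")
```

In this matrix the S-to-J and Z-to-L entries are both 5/32, so the gap between the two pair probabilities comes entirely from S spawning slightly more often than Z in the long run. The same helper works for repeats, e.g. `pair_probability("T", "T")`.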

(https://meatfighter.com/nintendotetrisai/ describes the RNG mechanism. However, it incorrectly assumes that all tetrominoes are spawned uniformly, and therefore gives an incorrect final distribution of pieces. The correct distribution is the left eigenvector of the transition matrix with eigenvalue 1, which is [T,J,Z,O,S,L,I]=[1369/9331, 47989/335916, 1334/9331, 1/7, 37/252, 5/36, 5/36], or approximately [14.672%, 14.286%, 14.296%, 14.286%, 14.683%, 13.889%, 13.889%].)

I also found that the model exploits the non-uniformity a lot: when trained with a uniform RNG (but evaluated on the NES RNG), its average score was about 1 to 2 lines lower. The non-uniformity also makes no-ro slightly easier: for the model trained with a uniform RNG, the average score on a uniform RNG is about 2 to 3 lines lower than on the NES RNG.
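The quoted fractions can be verified exactly with rational arithmetic. This is a small check written for this description, not code from the repository. It assumes rows are the current piece and columns are the next piece, and the helper name `step` is made up for the example.

```python
from fractions import Fraction as F

PIECES = "TJZOSLI"

# Transition counts out of 32 (row = current piece, column = next piece).
COUNTS = [
    [1, 5, 6, 5, 5, 5, 5],
    [6, 1, 5, 5, 5, 5, 5],
    [5, 6, 1, 5, 5, 5, 5],
    [5, 5, 5, 2, 5, 5, 5],
    [5, 5, 5, 5, 2, 5, 5],
    [6, 5, 5, 5, 5, 1, 5],
    [5, 5, 5, 5, 6, 5, 1],
]
P = [[F(c, 32) for c in row] for row in COUNTS]

# Distribution quoted in the text, in TJZOSLI order.
PI = [F(1369, 9331), F(47989, 335916), F(1334, 9331),
      F(1, 7), F(37, 252), F(5, 36), F(5, 36)]

def step(dist, p):
    """One step of the chain: row vector `dist` times matrix `p`."""
    n = len(p)
    return [sum(dist[i] * p[i][j] for i in range(n)) for j in range(n)]

is_probability = sum(PI) == 1          # entries add up to exactly 1
is_stationary = step(PI, P) == PI      # unchanged by one step of the chain

print("sums to 1:", is_probability, "| stationary:", is_stationary)
for name, value in zip(PIECES, PI):
    print(f"{name}: {value} = {float(value):.3%}")
```

Because `Fraction` arithmetic is exact, the equality checks confirm the eigenvector claim without any rounding tolerance.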

It would be harder to take the RNG non-uniformity into account in RTA runs, but some simple facts (e.g. the same piece spawning back-to-back is unlikely) can be used when building dependencies.

Video "[TARTA] NES Tetris No Rotation - 45 Lines (**NOT** RTA or TAS; read description)" from the channel Adrien Wu
Video information
22 November 2020, 21:01:20
00:09:37