Do you think I could just leave this part blank and it'd be okay? We're just going to replace the whole thing with a header image anyway, right?
You are not logged in.
Ohai. This subforum seemed kinda dead, so I saw Zoey's post about a world string and thought it might be an interesting StackOverflow-esque challenge, if anyone's still interested in bots.
Task:
Design a serialiser (a world-saver encoding) that can store all the blocks in a world in one line, in as few characters as possible.
If you can code, write the code that would encode and then decode + draw that world string. Don't post it yet though, so people don't copy.
Rules:
1. The string must contain no whitespace - no spaces, no tabs, no newlines; just black on white with nothing in between;
2. The string must contain the world size;
3. The string must work on all/most/many computers - i.e. don't design a Base-512 encoding with lots of Chinese characters that other players can't even render. None of this: ▯▯▯▯
4. The encoding must ofc work on all world sizes;
5. The encoding must work for all blocks. Switches, world portals, signs, mod text, everything;
6. You must be able to decode the string in a realistic amount of time. Say 3 mins tops.
Testing:
The following levels have been chosen to provide fair and unfair representations of EE worlds. The best encoding is the one that requires the fewest total characters for these worlds.
1. Empty world: PWW9KtOqWFcEI
In theory, an empty world (or a world with just the default border) should generate the shortest world string. (Hint: do you even need to encode the blocks into the string for a world like this??)
2. 200 lava minigames: PWKK8zFHH8bEI
A popular minigame level with lots of repeated blocks. (Hint: can you optimise the world string for a world with few special blocks?)
3. Tutorial #1: PWL17t1R6bbUI
It's not the original, but it's another opportunity to really compress that string. And it's got mod text!
4. Most of the blocks: PWtVBnyNUobEI
You have no idea how long this took to make. It might not have every single block, but it'll make one helluva world string. Probably the worst-case scenario in a challenge like this.
Notes:
A good encoding must compress the data while eliminating as much redundancy as possible. The baseline for comparison, I suppose, would be EE's own serialiser - if it were shoehorned into one line and the whitespace removed. Obviously it's the serialised "init" data that you'll be compressing (or the BigDB data if you're really showing off), so a world string that's longer than that would be pretty unimpressive.
Can you do better?
Post your world strings below with their lengths. If you post all 4 together they must be generated using the same code. You don't have to do encode all 4 worlds at once, but you should give a general overview (without giving too much away) of how your encoding works.
Feel free to discuss different ways of optimising world strings, such as:
- Using specific characters to indicate patterns, scenarios, block-sets etc;
- Storing lines or groups of blocks as well as individual blocks;
- Storing the inverse of some block IDs - e.g. start with BG everywhere and then remove some of it.
Deadline: EE's death.
Prizes: none / the satisfaction of showing you're the best / gems if someone else provides them. It's a challenge, not a competition.
Judges: meh. let the length of the string be the judge.
Thanks to: Therealrein
If you can't take part because you don't know how to work with the mapdata in "init", try this post.
Assumptions:
1. The largest portal ID you can expect is 99,999 - this is the largest ID you can place without a bot. We're not designing a world saver so it doesn't matter that someone with a bot could place ID 2,147,483,647 and mess up the encoding.
One bot to rule them all, one bot to find them. One bot to bring them all... and with this cliché blind them.
Offline
Offline
Sounds like it might be fun
Count me in (probably)
Edit: Also we should probably set a limit on the number of different characters allowed to be used, I guess 256?
Edit 2: Actually I started making a list of characters and thats a lot , maybe 128?
Offline
I was once writing a world saving algorithm for EE CM that would save the blocks in a string, using 2 base-64 digits per block (4096 block IDs), which in a string using UTF-8 (1 byte) characters, equals to 2 bytes per block. It included all blocks, even empty ones (ID 0).
It's ridiculously inefficient for the common worlds, which is why I'm not sharing it here, but helps saving big filled world using low space, which is what it was intended for.
Offline
Rules:
1. The string must contain no whitespace - no spaces, no tabs, no newlines; just black on white with nothing in between;
3. The string must work on all/most/many computers - i.e. don't design a Base-512 encoding with lots of Chinese characters that other players can't even render. None of this: ▯▯▯▯
Why does it have to be a string? Why aren't we allowed to store in our own binary form?
Warning!
This user has been found guilty by The Committee of Truth of using honesty, and reminding people of the past, without permission and outside of the allotted timeframes.
I’ve been asked if I’m ChatGPT5.
The answer is no.
I hope this helps! Let me know if you have any other questions.
Everybody edits, but some edit more than others
Offline
Why does it have to be a string? Why aren't we allowed to store in our own binary form?
I'm probably going to be converting it to binary then just use as many characters as are allowed to represent it
Offline
If you can't take part because you don't know how to work with the mapdata in "init", try this post.
Ignore the indexes - it's from an earlier version of EE.
Doesn't that mean I can't exactly make code that works with it?
suddenly random sig change
Offline
Why does it have to be a string? Why aren't we allowed to store in our own binary form?
If by "binary form" you mean a sequence of 1s and 0s, that's not gonna be particularly compact is it? By "string" I just mean a sequence of characters.
And if you're asking why that: so the whole thing can be easily copy/pasted as per Zoey's post.
Doesn't that mean I can't exactly make code that works with it?
That deserialiser is fine for any version of EE. The indexes just shift when more variables are added into "init" (before "ws" ofc) - that does nothing to the format of the mapdata itself, and doesn't affect the code either.
I'm probably going to be converting it to binary then just use as many characters as are allowed to represent it
I think you can get more creative if you start with something alphanumeric (plus additional symbols) rather than binary, because you can immediately begin designating special conditions such as a default empty world. The more special cases, patterns etc. that you add, the more work you pass to the decoder and so the shorter your world string becomes.
One bot to rule them all, one bot to find them. One bot to bring them all... and with this cliché blind them.
Offline
Why are spaces banned? They're a character just like everything else.
I'd say to allow 256 different chars. (which ones you pick doesnt really matter, it's gonna be the same length either way - though i think these ones work p well)
I don't know C#, have some basic ideas for how to get it to work tho
suddenly random sig change
Offline
You could probably do that with binary too, it would just be a bit harder to understand. My idea was to have a few 'layers' of encoding, so I can use any data type needed, then just feed it into something else that converts that to binary, and then that into a character stream with 1 character for every 7 bits or whatever.
Offline
Zumza wrote:Why does it have to be a string? Why aren't we allowed to store in our own binary form?
If by "binary form" you mean a sequence of 1s and 0s, that's not gonna be particularly compact is it? By "string" I just mean a sequence of characters.
A character is still a sequence of 1s and 0s...
If we take base64, well, it allows you to store 6 bits per byte, because there's a padding involved to make a random byte actually a valid ASCII character.
If I would store in a binary format I could use all the 8 bits of a byte, this type of file obviously won't open with any text-editor since it won't match any known encoding, but it could still be edited and viewed with a hex-editor.
And if you're asking why that: so the whole thing can be easily copy/pasted as per Zoey's post.
Thanks.
Warning!
This user has been found guilty by The Committee of Truth of using honesty, and reminding people of the past, without permission and outside of the allotted timeframes.
I’ve been asked if I’m ChatGPT5.
The answer is no.
I hope this helps! Let me know if you have any other questions.
Everybody edits, but some edit more than others
Offline
2. The string must contain the world size;
why who?
if i store all blocks of the world i will be able to get worldsize from there. Why would i need to store worldsize individually?
Offline
why who?
if i store all blocks of the world i will be able to get worldsize from there. Why would i need to store worldsize individually?
Thanks to: Ernesdo (Current Avatar), Zoey2070 (Signature)
Very inactive, maybe in the future, idk.
Offline
so i can come up with a basic system that has good worst-case and terrible average/best-case. (off the top of my head, 11/8 chars per block best and 140+21/8 chars per max length sign)
i can also come up with one that can make your 4 specified worlds with a total of 1 char each while following the rest of the rules. (totally a loophole in your system!!)
suddenly random sig change
Offline
if you don't add the 0 id blocks for compression. I could easily make a 25x25 world in a 400x200 (without borders) world and ya'll gonna think it's a 25x25 afaik
what i want to create an algorithm that doesn't need worldsize?
for example divide everything into 5x5 chunks and then to get the world size just do chunkscount * 5
This is a lazy example, but if you were to create a better one - don't need world size
Offline
Vinyl Melody wrote:if you don't add the 0 id blocks for compression. I could easily make a 25x25 world in a 400x200 (without borders) world and ya'll gonna think it's a 25x25 afaik
what i want to create an algorithm that doesn't need worldsize?
for example divide everything into 5x5 chunks and then to get the world size just do chunkscount * 5
This is a lazy example, but if you were to create a better one - don't need world size
If it contains information to find the world size (that works every time), I'd say that that counts as containing the world size
Offline
Put it this way: the decoder must always be able to quickly obtain the size of the world from the world string.
Ik I said in the assumptions that we're not building a world saver, but if the best encoding was eventually used in one, the bot would know before starting to draw if you tried to upload a level into a map too small for it.
i can also come up with one that can make your 4 specified worlds with a total of 1 char each while following the rest of the rules. (totally a loophole in your system!!)
Well then your world string is technically the data that each of those 4 chars represent, and there are far too few characters to assign one to every possible map so your encoding wouldn't win for a different or a huge set of levels.
Maybe later I'll add more worlds for testing, but for the time being the present ones are sufficiently varied for a fair test.
But you raise an interesting point; just how much data can you offload to the decoder before it's "cheating"? How about this: if you start to code groups of (hardcoded and/or offset-able) coordinates into your decoder, that data really belongs in the world string. Loops and other algorithms are fine.
One bot to rule them all, one bot to find them. One bot to bring them all... and with this cliché blind them.
Offline
Done.
1. Empty world: PWW9KtOqWFcEI
Serialized: 7 bytes.
2. 200 lava minigames: PWKK8zFHH8bEI
Serialized: 7 bytes.
3. Tutorial #1: PWL17t1R6bbUI
Serialized: 6 bytes.
4. Most of the blocks: PWtVBnyNUobEI
Serialized: 9 byte.
Encode / decode takes less than 0.1s.
Obviously, any other world besides the ones you provided uses more characters (since its not hardcoded into the bot), but it can indeed encode and decode ANY block and ANY world.
I have never thought of programming for reputation and honor. What I have in my heart must come out. That is the reason why I code.
Offline
Done.
1. Empty world: PWW9KtOqWFcEI
Serialized: 7 bytes.
▼Hidden text2. 200 lava minigames: PWKK8zFHH8bEI
Serialized: 7 bytes.▼Hidden text3. Tutorial #1: PWL17t1R6bbUI
Serialized: 6 bytes.▼Hidden text4. Most of the blocks: PWtVBnyNUobEI
Serialized: 9 byte.▼Hidden textEncode / decode takes less than 0.1s.
Obviously, any other world besides the ones you provided uses more characters (since its not hardcoded into the bot), but it can indeed encode and decode ANY block and ANY world.
How about this: if you start to code groups of (hardcoded and/or offset-able) coordinates into your decoder, that data really belongs in the world string. Loops and other algorithms are fine.
suddenly random sig change
Offline
Ah. I read every post but the last one when I posted
Well doesn't his statement contradict this:
In theory, an empty world (or a world with just the default border) should generate the shortest world string. (Hint: do you even need to encode the blocks into the string for a world like this??)
The world borders are a hard coded series of blocks...
I have never thought of programming for reputation and honor. What I have in my heart must come out. That is the reason why I code.
Offline
Serious entry this time.
This has 0 blocks hard-coded, the empty world is all 0 fg and bgs. In my opinion, hard-coding the world borders / making a special "empty world" output in your format is a type of optimization that is just gaming the judging method, so I will refrain from making such an opimization.
1. Empty world: PWW9KtOqWFcEI
Serialized: 52 bytes.
2. 200 lava minigames: PWKK8zFHH8bEI
Serialized: 32762 bytes.
3. Tutorial #1: PWL17t1R6bbUI
Serialized: 16038 bytes.
4. Most of the blocks: PWtVBnyNUobEI
Serialized: 14610 byte.
Encode / decode takes less than 3s.
I have never thought of programming for reputation and honor. What I have in my heart must come out. That is the reason why I code.
Offline
I think we should at least describe the solution if not posting the algorithm.
Warning!
This user has been found guilty by The Committee of Truth of using honesty, and reminding people of the past, without permission and outside of the allotted timeframes.
I’ve been asked if I’m ChatGPT5.
The answer is no.
I hope this helps! Let me know if you have any other questions.
Everybody edits, but some edit more than others
Offline
In my opinion, hard-coding the world borders / making a special "empty world" output in your format is a type of optimization that is just gaming the judging method, so I will refrain from making such an opimization.
That may be taking advantage of the judging method for the four worlds provided, but what if the testing comprised thousands of random levels? Empty worlds and empty grey-bordered worlds would be the only maps repeated over and over, and that's the first redundancy that could be removed. The world strings would look something like:
Empty world: "100x100"
Empty world with border: "100x100b"
I'm surprised that your encoder generated a longer string for the tutorial world than for "Most of the blocks".
There's optimisations that could be made in your strings by replacing frequently-occurring substrings with single characters. For instance, replacing "Ookak" with | in the '200 lava minigames' string would save 4,368 characters, shortening the string by a whole 13.33%. I'd say it's perfectly legal to run some substring analysis to eliminate redundancies; you just need more characters. Áéíóú and the like should work on most PCs.
One bot to rule them all, one bot to find them. One bot to bring them all... and with this cliché blind them.
Offline
Let's stick to ASCII.
Warning!
This user has been found guilty by The Committee of Truth of using honesty, and reminding people of the past, without permission and outside of the allotted timeframes.
I’ve been asked if I’m ChatGPT5.
The answer is no.
I hope this helps! Let me know if you have any other questions.
Everybody edits, but some edit more than others
Offline
I've been planning for mine to work with ANSI, which is basically an extended version of ASCII that is the default file type of most text editors, so should be supported on almost all computers
Edit: Some sort of character count rule should probably be included, because otherwise you can basically always improve your encoding by just using another one or two characters, I would suggest either ASCII or ANSI (well the Windows thing called ANSI), or maybe just a max number of different characters allowed to be used
Offline
[ Started around 1747560034.8276 - Generated in 0.292 seconds, 12 queries executed - Memory usage: 2.08 MiB (Peak: 2.57 MiB) ]