Official Everybody Edits Forums

Do you think I could just leave this part blank and it'd be okay? We're just going to replace the whole thing with a header image anyway, right?

You are not logged in.

#1 2017-11-11 20:34:02, last edited by Tomahawk (2017-11-12 01:57:14)

Tomahawk
Forum Mod
From: UK
Joined: 2015-02-18
Posts: 2,848

[Challenge] World string

Ohai. This subforum seemed kinda dead, so I saw Zoey's post about a world string and thought it might be an interesting StackOverflow-esque challenge, if anyone's still interested in bots.

Task:
Design a serialiser (a world-saver encoding) that can store all the blocks in a world in one line, in as few characters as possible.

If you can code, write the code that would encode and then decode + draw that world string. Don't post it yet though, so people don't copy.

Rules:
1. The string must contain no whitespace - no spaces, no tabs, no newlines; just black on white with nothing in between;
2. The string must contain the world size;
3. The string must work on all/most/many computers - i.e. don't design a Base-512 encoding with lots of Chinese characters that other players can't even render. None of this: ▯▯▯▯
4. The encoding must ofc work on all world sizes;
5. The encoding must work for all blocks. Switches, world portals, signs, mod text, everything;
6. You must be able to decode the string in a realistic amount of time. Say 3 mins tops.

Testing:
The following levels have been chosen to provide fair and unfair representations of EE worlds. The best encoding is the one that requires the fewest total characters for these worlds.

1. Empty world: PWW9KtOqWFcEI

Image

In theory, an empty world (or a world with just the default border) should generate the shortest world string. (Hint: do you even need to encode the blocks into the string for a world like this??)

2. 200 lava minigames: PWKK8zFHH8bEI

Image

A popular minigame level with lots of repeated blocks. (Hint: can you optimise the world string for a world with few special blocks?)

3. Tutorial #1: PWL17t1R6bbUI

Image

It's not the original, but it's another opportunity to really compress that string. And it's got mod text!

4. Most of the blocks: PWtVBnyNUobEI

Image

You have no idea how long this took to make. It might not have every single block, but it'll make one helluva world string. Probably the worst-case scenario in a challenge like this.

Notes:
A good encoding must compress the data while eliminating as much redundancy as possible. The baseline for comparison, I suppose, would be EE's own serialiser - if it were shoehorned into one line and the whitespace removed. Obviously it's the serialised "init" data that you'll be compressing (or the BigDB data if you're really showing off), so a world string that's longer than that would be pretty unimpressive.

Can you do better?

Post your world strings below with their lengths. If you post all 4 together they must be generated using the same code. You don't have to do encode all 4 worlds at once, but you should give a general overview (without giving too much away) of how your encoding works.

Feel free to discuss different ways of optimising world strings, such as:
- Using specific characters to indicate patterns, scenarios, block-sets etc;
- Storing lines or groups of blocks as well as individual blocks;
- Storing the inverse of some block IDs - e.g. start with BG everywhere and then remove some of it.

Deadline: EE's death.
Prizes: none / the satisfaction of showing you're the best / gems if someone else provides them. It's a challenge, not a competition.
Judges: meh. let the length of the string be the judge.
Thanks to: Therealrein
If you can't take part because you don't know how to work with the mapdata in "init", try this post.

Assumptions:
1. The largest portal ID you can expect is 99,999 - this is the largest ID you can place without a bot. We're not designing a world saver so it doesn't matter that someone with a bot could place ID 2,147,483,647 and mess up the encoding.


One bot to rule them all, one bot to find them. One bot to bring them all... and with this cliché blind them.

Offline

#2 2017-11-11 20:38:39

hummerz5
Member
From: wait I'm not a secret mod huh
Joined: 2015-08-10
Posts: 5,853

Re: [Challenge] World string

I feel like the baseline to compare to would be taking Atilla's world saving format (I think that exists) and just converting the binary file data to Base64. We should shoot to do better than that eh?

Offline

Wooted by:

#3 2017-11-11 21:00:41, last edited by LukeM (2017-11-11 21:09:51)

LukeM
Member
From: England
Joined: 2016-06-03
Posts: 3,009
Website

Re: [Challenge] World string

Sounds like it might be fun //forums.everybodyedits.com/img/smilies/big_smile

Count me in (probably)

Edit: Also we should probably set a limit on the number of different characters allowed to be used, I guess 256?

Edit 2: Actually I started making a list of characters and thats a lot //forums.everybodyedits.com/img/smilies/tongue, maybe 128?

Offline

Wooted by:

#4 2017-11-11 22:39:07, last edited by drunkbnu (2017-11-11 22:45:51)

drunkbnu
Formerly HG
Joined: 2017-08-16
Posts: 2,306

Re: [Challenge] World string

I was once writing a world saving algorithm for EE CM that would save the blocks in a string, using 2 base-64 digits per block (4096 block IDs), which in a string using UTF-8 (1 byte) characters, equals to 2 bytes per block. It included all blocks, even empty ones (ID 0).

It's ridiculously inefficient for the common worlds, which is why I'm not sharing it here, but helps saving big filled world using low space, which is what it was intended for.

Offline

#5 2017-11-11 22:49:24

Zumza
Member
From: root
Joined: 2015-02-17
Posts: 4,656

Re: [Challenge] World string

Tomahawk wrote:

Rules:
1. The string must contain no whitespace - no spaces, no tabs, no newlines; just black on white with nothing in between;
3. The string must work on all/most/many computers - i.e. don't design a Base-512 encoding with lots of Chinese characters that other players can't even render. None of this: ▯▯▯▯

Why does it have to be a string? Why aren't we allowed to store in our own binary form?


Everybody edits, but some edit more than others

Offline

#6 2017-11-11 23:19:48

LukeM
Member
From: England
Joined: 2016-06-03
Posts: 3,009
Website

Re: [Challenge] World string

Zumza wrote:

Why does it have to be a string? Why aren't we allowed to store in our own binary form?

I'm probably going to be converting it to binary then just use as many characters as are allowed to represent it

Offline

#7 2017-11-11 23:37:01

Slabdrill
Formerly 12345678908642
From: canada
Joined: 2015-08-15
Posts: 3,402
Website

Re: [Challenge] World string

Tomahawk wrote:

If you can't take part because you don't know how to work with the mapdata in "init", try this post.

Ignore the indexes - it's from an earlier version of EE.

Doesn't that mean I can't exactly make code that works with it?


suddenly random sig change

Offline

#8 2017-11-12 01:26:27

Tomahawk
Forum Mod
From: UK
Joined: 2015-02-18
Posts: 2,848

Re: [Challenge] World string

Zumza wrote:

Why does it have to be a string? Why aren't we allowed to store in our own binary form?

If by "binary form" you mean a sequence of 1s and 0s, that's not gonna be particularly compact is it? By "string" I just mean a sequence of characters.

And if you're asking why that: so the whole thing can be easily copy/pasted as per Zoey's post.

Slabdrill wrote:

Doesn't that mean I can't exactly make code that works with it?

That deserialiser is fine for any version of EE. The indexes just shift when more variables are added into "init" (before "ws" ofc) - that does nothing to the format of the mapdata itself, and doesn't affect the code either.

destroyer123 wrote:

I'm probably going to be converting it to binary then just use as many characters as are allowed to represent it

I think you can get more creative if you start with something alphanumeric (plus additional symbols) rather than binary, because you can immediately begin designating special conditions such as a default empty world. The more special cases, patterns etc. that you add, the more work you pass to the decoder and so the shorter your world string becomes.


One bot to rule them all, one bot to find them. One bot to bring them all... and with this cliché blind them.

Offline

#9 2017-11-12 01:43:51, last edited by Slabdrill (2017-11-12 01:47:04)

Slabdrill
Formerly 12345678908642
From: canada
Joined: 2015-08-15
Posts: 3,402
Website

Re: [Challenge] World string

Why are spaces banned? They're a character just like everything else.
I'd say to allow 256 different chars. (which ones you pick doesnt really matter, it's gonna be the same length either way - though i think these ones work p well)

I don't know C#, have some basic ideas for how to get it to work tho


suddenly random sig change

Offline

#10 2017-11-12 01:48:12

LukeM
Member
From: England
Joined: 2016-06-03
Posts: 3,009
Website

Re: [Challenge] World string

You could probably do that with binary too, it would just be a bit harder to understand. My idea was to have a few 'layers' of encoding, so I can use any data type needed, then just feed it into something else that converts that to binary, and then that into a character stream with 1 character for every 7 bits or whatever.

Offline

#11 2017-11-12 12:54:49

Zumza
Member
From: root
Joined: 2015-02-17
Posts: 4,656

Re: [Challenge] World string

Tomahawk wrote:
Zumza wrote:

Why does it have to be a string? Why aren't we allowed to store in our own binary form?

If by "binary form" you mean a sequence of 1s and 0s, that's not gonna be particularly compact is it? By "string" I just mean a sequence of characters.

A character is still a sequence of 1s and 0s...
If we take base64, well, it allows you to store 6 bits per byte, because there's a padding involved to make a random byte actually a valid ASCII character.
If I would store in a binary format I could use all the 8 bits of a byte, this type of file obviously won't open with any text-editor since it won't match any known encoding, but it could still be edited and viewed with a hex-editor.

Tomahawk wrote:

And if you're asking why that: so the whole thing can be easily copy/pasted as per Zoey's post.

Thanks.


Everybody edits, but some edit more than others

Offline

#12 2017-11-12 14:36:22

Gosha
Member
From: Russia
Joined: 2015-03-15
Posts: 6,211

Re: [Challenge] World string

Tomahawk wrote:

2. The string must contain the world size;

why who?
if i store all blocks of the world i will be able to get worldsize from there. Why would i need to store worldsize individually?

Offline

#13 2017-11-12 15:09:38

Vinyl Melody
Formerly BananaMilkShake
Joined: 2016-06-19
Posts: 616

Re: [Challenge] World string

Gosha wrote:

why who?
if i store all blocks of the world i will be able to get worldsize from there. Why would i need to store worldsize individually?

Hidden text

cb0de83627.png
Thanks to: Ernesdo (Current Avatar), Zoey2070 (Signature)

Very inactive, maybe in the future, idk.

Offline

Wooted by: (3)

#14 2017-11-12 16:39:54

Slabdrill
Formerly 12345678908642
From: canada
Joined: 2015-08-15
Posts: 3,402
Website

Re: [Challenge] World string

so i can come up with a basic system that has good worst-case and terrible average/best-case. (off the top of my head, 11/8 chars per block best and 140+21/8 chars per max length sign)

i can also come up with one that can make your 4 specified worlds with a total of 1 char each while following the rest of the rules. (totally a loophole in your system!!)


suddenly random sig change

Offline

#15 2017-11-12 17:10:59

Gosha
Member
From: Russia
Joined: 2015-03-15
Posts: 6,211

Re: [Challenge] World string

Vinyl Melody wrote:

if you don't add the 0 id blocks for compression. I could easily make a 25x25 world in a 400x200 (without borders) world and ya'll gonna think it's a 25x25 afaik

what i want to create an algorithm that doesn't need worldsize?
for example divide everything into 5x5 chunks and then to get the world size just do chunkscount * 5
This is a lazy example, but if you were to create a better one - don't need world size

Offline

#16 2017-11-12 17:13:28

LukeM
Member
From: England
Joined: 2016-06-03
Posts: 3,009
Website

Re: [Challenge] World string

Gosha wrote:
Vinyl Melody wrote:

if you don't add the 0 id blocks for compression. I could easily make a 25x25 world in a 400x200 (without borders) world and ya'll gonna think it's a 25x25 afaik

what i want to create an algorithm that doesn't need worldsize?
for example divide everything into 5x5 chunks and then to get the world size just do chunkscount * 5
This is a lazy example, but if you were to create a better one - don't need world size

If it contains information to find the world size (that works every time), I'd say that that counts as containing the world size

Offline

#17 2017-11-12 18:49:41

Tomahawk
Forum Mod
From: UK
Joined: 2015-02-18
Posts: 2,848

Re: [Challenge] World string

Put it this way: the decoder must always be able to quickly obtain the size of the world from the world string.

Ik I said in the assumptions that we're not building a world saver, but if the best encoding was eventually used in one, the bot would know before starting to draw if you tried to upload a level into a map too small for it.

Slabdrill wrote:

i can also come up with one that can make your 4 specified worlds with a total of 1 char each while following the rest of the rules. (totally a loophole in your system!!)

Well then your world string is technically the data that each of those 4 chars represent, and there are far too few characters to assign one to every possible map so your encoding wouldn't win for a different or a huge set of levels.

Maybe later I'll add more worlds for testing, but for the time being the present ones are sufficiently varied for a fair test.

But you raise an interesting point; just how much data can you offload to the decoder before it's "cheating"? How about this: if you start to code groups of (hardcoded and/or offset-able) coordinates into your decoder, that data really belongs in the world string. Loops and other algorithms are fine.


One bot to rule them all, one bot to find them. One bot to bring them all... and with this cliché blind them.

Offline

#18 2017-11-12 19:40:13, last edited by Processor (2017-11-12 19:40:21)

Processor
Member
Joined: 2015-02-15
Posts: 2,246

Re: [Challenge] World string

Done.

1. Empty world: PWW9KtOqWFcEI

Serialized: 7 bytes.

Hidden text

2. 200 lava minigames: PWKK8zFHH8bEI
Serialized: 7 bytes.

Hidden text

3. Tutorial #1: PWL17t1R6bbUI
Serialized: 6 bytes.

Hidden text

4. Most of the blocks: PWtVBnyNUobEI
Serialized: 9 byte.

Hidden text

Encode / decode takes less than 0.1s.

Obviously, any other world besides the ones you provided uses more characters (since its not hardcoded into the bot), but it can indeed encode and decode ANY block and ANY world.


I have never thought of programming for reputation and honor. What I have in my heart must come out. That is the reason why I code.

Offline

Wooted by: (3)

#19 2017-11-12 19:43:39

Slabdrill
Formerly 12345678908642
From: canada
Joined: 2015-08-15
Posts: 3,402
Website

Re: [Challenge] World string

Processor wrote:

Done.

1. Empty world: PWW9KtOqWFcEI

Serialized: 7 bytes.

Hidden text

2. 200 lava minigames: PWKK8zFHH8bEI
Serialized: 7 bytes.

Hidden text

3. Tutorial #1: PWL17t1R6bbUI
Serialized: 6 bytes.

Hidden text

4. Most of the blocks: PWtVBnyNUobEI
Serialized: 9 byte.

Hidden text

Encode / decode takes less than 0.1s.

Obviously, any other world besides the ones you provided uses more characters (since its not hardcoded into the bot), but it can indeed encode and decode ANY block and ANY world.

How about this: if you start to code groups of (hardcoded and/or offset-able) coordinates into your decoder, that data really belongs in the world string. Loops and other algorithms are fine.


suddenly random sig change

Offline

#20 2017-11-12 20:00:37

Processor
Member
Joined: 2015-02-15
Posts: 2,246

Re: [Challenge] World string

Ah. I read every post but the last one when I posted //forums.everybodyedits.com/img/smilies/big_smile

Well doesn't his statement contradict this:

Tomahawk wrote:

In theory, an empty world (or a world with just the default border) should generate the shortest world string. (Hint: do you even need to encode the blocks into the string for a world like this??)

The world borders are a hard coded series of blocks...


I have never thought of programming for reputation and honor. What I have in my heart must come out. That is the reason why I code.

Offline

#21 2017-11-12 20:31:30

Processor
Member
Joined: 2015-02-15
Posts: 2,246

Re: [Challenge] World string

Serious entry this time.

This has 0 blocks hard-coded, the empty world is all 0 fg and bgs. In my opinion, hard-coding the world borders / making a special "empty world" output in your format is a type of optimization that is just gaming the judging method, so I will refrain from making such an opimization.

1. Empty world: PWW9KtOqWFcEI

Serialized: 52 bytes.

Output
Human-readable form

2. 200 lava minigames: PWKK8zFHH8bEI
Serialized: 32762 bytes.

Output

3. Tutorial #1: PWL17t1R6bbUI
Serialized: 16038 bytes.

Output

4. Most of the blocks: PWtVBnyNUobEI
Serialized: 14610 byte.

Hidden text

Encode / decode takes less than 3s.


I have never thought of programming for reputation and honor. What I have in my heart must come out. That is the reason why I code.

Offline

#22 2017-11-12 23:24:16

Zumza
Member
From: root
Joined: 2015-02-17
Posts: 4,656

Re: [Challenge] World string

I think we should at least describe the solution if not posting the algorithm.


Everybody edits, but some edit more than others

Offline

Wooted by: (2)

#23 2017-11-12 23:27:58

Tomahawk
Forum Mod
From: UK
Joined: 2015-02-18
Posts: 2,848

Re: [Challenge] World string

Processor wrote:

In my opinion, hard-coding the world borders / making a special "empty world" output in your format is a type of optimization that is just gaming the judging method, so I will refrain from making such an opimization.

That may be taking advantage of the judging method for the four worlds provided, but what if the testing comprised thousands of random levels? Empty worlds and empty grey-bordered worlds would be the only maps repeated over and over, and that's the first redundancy that could be removed. The world strings would look something like:

Empty world: "100x100"
Empty world with border: "100x100b"

I'm surprised that your encoder generated a longer string for the tutorial world than for "Most of the blocks".

There's optimisations that could be made in your strings by replacing frequently-occurring substrings with single characters. For instance, replacing "Ookak" with | in the '200 lava minigames' string would save 4,368 characters, shortening the string by a whole 13.33%. I'd say it's perfectly legal to run some substring analysis to eliminate redundancies; you just need more characters. Áéíóú and the like should work on most PCs.


One bot to rule them all, one bot to find them. One bot to bring them all... and with this cliché blind them.

Offline

#24 2017-11-12 23:30:52

Zumza
Member
From: root
Joined: 2015-02-17
Posts: 4,656

Re: [Challenge] World string

Let's stick to ASCII.


Everybody edits, but some edit more than others

Offline

#25 2017-11-13 00:42:23, last edited by LukeM (2017-11-13 00:50:22)

LukeM
Member
From: England
Joined: 2016-06-03
Posts: 3,009
Website

Re: [Challenge] World string

I've been planning for mine to work with ANSI, which is basically an extended version of ASCII that is the default file type of most text editors, so should be supported on almost all computers

Edit: Some sort of character count rule should probably be included, because otherwise you can basically always improve your encoding by just using another one or two characters, I would suggest either ASCII or ANSI (well the Windows thing called ANSI), or maybe just a max number of different characters allowed to be used

Offline

Wooted by:
Tomahawk1547572748738826

Board footer

Powered by FluxBB

[ Started around 1733250583.0609 - Generated in 0.536 seconds, 13 queries executed - Memory usage: 2.06 MiB (Peak: 2.55 MiB) ]