11-17-18 11:05:37 PM

Jul - Game Development/Mod Projects - Datamijn
Posted on 05-04-18 05:07:22 AM
I'm working on an awesome binary data description language and YOU get to be a part of it!

Datamijn is primarily a domain-specific language for describing binaries. It's meant to be as concise as possible. You know what the data looks like and you just want YAML out of it. Datamijn won't make you write any boilerplate.

A datamijn definition file is natural to write and read. Example:

position {
    x u8
    y u8
}

_start {
    version u16
    positions [8]position
}

In this .dm file, we describe a type, "position", consisting of two bytes. We then describe how to parse the binary from the beginning.

Is position only used once? No need to pollute your namespace. In datamijn, you can always define a type anonymously.

version u16
positions [8]{
    x u8
    y u8
}

In this example, we haven't defined _start, so datamijn starts parsing the file from the first field.

A binary file parsed with such a definition might end up looking like this pleasant YAML (only the first two of the eight positions are shown).

version: 1
positions:
- x: 5
  y: 6
- x: 26
  y: 12
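
For comparison, here is roughly what that definition saves you from writing by hand. This is only a sketch in Python, not how datamijn works internally. The function name `parse` and the sample bytes are made up for illustration, and little-endian byte order for the u16 is an assumption, since the post doesn't say which order datamijn uses.

```python
import struct

def parse(data: bytes) -> dict:
    """Hand-written equivalent of: version u16, positions [8]{x u8, y u8}."""
    # Assumption: the u16 is little-endian; the post does not state byte order.
    (version,) = struct.unpack_from("<H", data, 0)
    positions = []
    offset = 2  # the positions follow directly after the 2-byte version
    for _ in range(8):
        x, y = struct.unpack_from("BB", data, offset)
        positions.append({"x": x, "y": y})
        offset += 2
    return {"version": version, "positions": positions}

# Hypothetical sample file: version 1, two filled-in positions, six zeroed ones.
sample = bytes([1, 0, 5, 6, 26, 12]) + bytes(12)
result = parse(sample)
```

Dumping `result` with any YAML library would give output shaped like the YAML above.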

What can datamijn actually do?

Here's a more complex example in which I'm parsing a bunch of data from Final Fantasy 1... including strings!


string_ptr is of particular note: it involves reading a NES pointer, doing some calculations to convert it into an absolute pointer within the NES ROM, and then reading a zero-terminated string at that location. The idea is that you won't even have to write this pointer math: nesptr should be a part of the standard library!
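
To give an idea of the pointer math involved, here is a hand-written Python sketch of that kind of lookup. It is not datamijn code and not necessarily the exact calculation used for Final Fantasy 1. The helper names, the 16 KiB bank size, and the assumption that the bank is mapped at CPU address $8000 are all illustrative; the right values depend on the game's mapper. The 16-byte iNES header and the little-endian 16-bit pointer are standard for NES ROMs.

```python
import struct

INES_HEADER_SIZE = 16    # iNES files start with a 16-byte header
BANK_SIZE = 0x4000       # assumption: 16 KiB PRG banks
CPU_BANK_BASE = 0x8000   # assumption: the bank is mapped at $8000

def nes_ptr_to_offset(cpu_addr: int, bank: int) -> int:
    """Convert a CPU address inside a given bank to an absolute file offset."""
    return INES_HEADER_SIZE + bank * BANK_SIZE + (cpu_addr - CPU_BANK_BASE)

def read_string_ptr(rom: bytes, ptr_offset: int, bank: int) -> bytes:
    """Read a 16-bit pointer, follow it, and return the zero-terminated bytes."""
    (cpu_addr,) = struct.unpack_from("<H", rom, ptr_offset)  # 6502 is little-endian
    start = nes_ptr_to_offset(cpu_addr, bank)
    end = rom.index(b"\x00", start)  # the terminator itself is not included
    return bytes(rom[start:end])

# Hypothetical ROM: header plus one bank, with a pointer to $8010 at the bank start.
rom = bytearray(INES_HEADER_SIZE + BANK_SIZE)
rom[16:18] = b"\x10\x80"            # pointer value $8010, little-endian
rom[32:36] = b"\x8a\x8b\x8c\x00"    # raw string bytes at file offset 16 + 0x10
raw = read_string_ptr(rom, 16, 0)
```

The result is left as raw bytes, because NES games usually use their own character tables rather than ASCII, so turning them into text needs a separate mapping step.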

The end goal is that you'll describe a ROM according to what you know... and get all of the data out of it, without having to write boilerplate or possibly even touch a programming language. Of course, this is all a lie, because if you're decompressing some data using datamijn, you'll probably have to write a for loop or two. But the idea is that it should all be super concise and intuitive.

Support for writing is also on the roadmap, but only as a second-class citizen. It's likely there will be some limitations.

Datamijn isn't complete and I won't be releasing it until I deem it usable, but it's already quite powerful and evolves with my own needs.

Is datamijn interesting to you? Do you have any questions or ideas? I'll probably follow this up with some thoughts on how to do decompression, but this should be enough for now.

Posted on 07-06-18 03:56:39 AM
I'm all for more utilities like QuickBMS. Exploding binary data into readable (and processable) text data is invaluable in reverse-engineering and this seems like a pretty darn good thing for doing that.