07-22-18 09:17:04 PM

Jul - Game Development/Mod Projects - Datamijn
Thread history
Posts: 21/22
I'm all for more utilities like QuickBMS. Exploding binary data into readable (and processable) text data is invaluable in reverse-engineering and this seems like a pretty darn good thing for doing that.
Posts: 1717/1729
I'm working on an awesome binary data description language and YOU get to be a part of it!

Datamijn is primarily a domain-specific language for describing binaries. It's meant to be as concise as possible. You know what the data looks like and you just want YAML out of it. Datamijn won't make you write any boilerplate.

A datamijn definition file is natural to write and read. Example:

position {
    x u8
    y u8
}

_start {
    version u16
    positions [8]position
}

In this .dm file, we describe a type, "position", consisting of two bytes. We then describe how to parse the binary from the beginning: a 16-bit version number followed by an array of eight positions.
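For comparison, here is roughly what the same description costs you by hand in Python. This is only a sketch of the idea, not how datamijn works internally. The function names parse_position and parse_start are made up for illustration, and little-endian byte order for the u16 is an assumption, since the post doesn't say.

```python
import struct

def parse_position(data, offset):
    # A position is two unsigned bytes: x, then y.
    x, y = struct.unpack_from("<BB", data, offset)
    return {"x": x, "y": y}, offset + 2

def parse_start(data):
    # version is a u16; little-endian is an assumption here.
    (version,) = struct.unpack_from("<H", data, 0)
    offset = 2
    positions = []
    # positions [8]position: eight position records back to back.
    for _ in range(8):
        pos, offset = parse_position(data, offset)
        positions.append(pos)
    return {"version": version, "positions": positions}
```

Every field costs a format string, an offset update, and a dict entry, which is exactly the boilerplate the .dm file leaves out.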

Is position only used once? No need to pollute your namespace. In datamijn, you can always define a type anonymously.

version u16
positions [8]{
    x u8
    y u8
}
In this example, we haven't defined _start, so datamijn starts parsing the file from the first field.

A binary file parsed with such a definition might end up looking like this pleasant YAML.

version: 1
positions:
- x: 5
  y: 6
- x: 26
  y: 12
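To make the mapping from parsed fields to that output concrete, here is a tiny, dependency-free sketch that prints a parsed record in the same shape. The to_yaml helper is hypothetical and only handles this one layout (scalars plus a list of flat records). A real tool would more likely hand the data to a proper YAML library.

```python
def to_yaml(record):
    lines = []
    for key, value in record.items():
        if isinstance(value, list):
            # A list field becomes a key followed by "- " items.
            lines.append(f"{key}:")
            for item in value:
                first = True
                for k, v in item.items():
                    # Only the first field of each item gets the dash.
                    prefix = "- " if first else "  "
                    lines.append(f"{prefix}{k}: {v}")
                    first = False
        else:
            lines.append(f"{key}: {value}")
    return "\n".join(lines)
```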

What can datamijn actually do?

Here's a more complex example in which I'm parsing a bunch of data from Final Fantasy 1... including strings!


string_ptr is of particular note: it involves reading a NES pointer, doing some calculations to convert it into an absolute pointer within the NES ROM, and then reading a zero-terminated string at that location. The idea is that you won't even have to write this pointer math: nesptr should be a part of the standard library!
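For anyone who hasn't done this pointer math before, here's a rough Python sketch of the kind of thing string_ptr has to do. It rests on several assumptions that aren't from the definition itself: a 16-byte iNES header, 16 KiB PRG banks mapped at $8000, and a bank number supplied by the caller. The helper names are invented for illustration. The string comes back as raw bytes, since the game's text encoding is a separate problem.

```python
import struct

INES_HEADER_SIZE = 16   # standard iNES header length
BANK_SIZE = 0x4000      # assumption: 16 KiB PRG banks
BANK_BASE = 0x8000      # assumption: bank is mapped at CPU address $8000

def nes_ptr_to_offset(ptr, bank):
    # Convert a CPU address inside the given bank to a ROM file offset.
    return INES_HEADER_SIZE + bank * BANK_SIZE + (ptr - BANK_BASE)

def read_string_ptr(rom, pos, bank):
    # Read a 16-bit little-endian pointer stored at file offset pos.
    (ptr,) = struct.unpack_from("<H", rom, pos)
    start = nes_ptr_to_offset(ptr, bank)
    # Zero-terminated: everything up to (not including) the next 0x00.
    end = rom.index(b"\x00", start)
    return rom[start:end]
```

With nesptr in a standard library, none of these constants or helpers would need to appear in a definition file at all.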

The end goal is that you'll describe a ROM according to what you know... and get all of the data out of it, without having to write boilerplate or possibly even touching a programming language. Of course, this is all a lie, because if you're decompressing some data using datamijn, you'll probably have to write a for loop or two. But the idea is that it should all be super concise and intuitive.

Support for writing is also on the roadmap, but only as a second-class citizen. It's likely there will be some limitations.

Datamijn isn't complete and I won't be releasing it until I deem it usable, but it's already quite powerful and evolves with my own needs.

Is datamijn interesting to you? Do you have any questions or ideas? I'll probably follow this up with some thoughts on how to do decompression, but this should be enough for now.