Thursday, June 19, 2014

json-ld (part 2 of 3)

Subject: json-ld (part 2 of 3)

  • note: originally emailed 06/19/2014 and sanitized for public consumption
  • sidenote: posting here so i can lead people here whenever they ask me about this stuff...
  • series: part 1 & part 3

This message is primarily for the TO: peeps. But some of you have asked about json-ld before and this would benefit everyone here for future reference.

While reading up on some (computer) graphics articles -- I found out (in a round-about way) what linked-data (um, actually) the * description languages would be good for.

Not so fast:
First, why I still think it's not needed (in the context of javascript's json handler) -- it's unused information (you'll see this in a sec).
  • For more info on why I don't like json-ld (just like why I don't like VM's) -- it's heavy with a lot of un-needed cruff (you can read more about it from the email thread below).
    • Again, I'm coming from a javascript coder [for a project with restricted and security considerations] point of view...

So, what is it good for?
It (linked-data/interface description language/data format description language/etc.) is good for:
  • binary only data (i.e. structs/class)
    • getting transported between different programming languages
    • which might be running on different architecture (endian-ness, 32 vs 64 bits, etc.)
  • these (the data payload) are quite specific and can lead to catastrophic behavior if not handled properly

Examples of this type of mechanisms are:
  • ZeroC's Ice
  • Apache's Thrift
  • Google's Protocol Buffer
  • etc...

What does all that mean?
Basically, it's good for defining your data structs without (sort of) knowing anything about them.

Here's what I mean. Back to the graphics article I mentioned before. Blender (3D) is an open source modeler that had its beginnings 20 years ago.

The creator of that program made his binary data files with "encoded types for the entire internal structure" -- in other words, self-describing and included in the data file. The intention was to make it backwards compatible knowing that future versions of the program will always require inevitable changes to the data file format.

Why strike linked-data and call out the * description languages
Back to "unused information" I mentioned earlier, [for our system] -- the LD stuff will NEVER be used (I can pretty much say that now with 99.9% certainty).
  • C/C++/java/etc. however, would HUGELY benefit from a self-describing struct/class object.

So when I started to play with:

I said to myself, "this is interesting -- but what is this good for?"" -- I even said ([in part 1]):
  • this terse way of representing all of the data [is] highly error prone if written by humans -- this will need to be generated with tools just to be safe

The playground only showed results for JSON-LD and an RDF item.
  • But, there seems to be a number of possible representation for the same dataset:
  • I thought to myself, there has got to be a better way to create this (remember: highly error prone if written by humans)

Then, the following showed up in [redacted] (today in fact!):
  • [link redacted]
  • They mentioned hive-json
    • This showed how a json ld-ish output was created from (in this case) the apache hive schema

So, out of curiosity, I thought if there were other ways to help write json-ld via converters?

No comments: