God, a language for good ol' data
Data serialization can be better, without being too much.
{
name = "Will";
age = 26;
married = false;
favorite-movies = [
{
title = "Interstellar";
starring = [
"Matthew McConaughey"
"Jessica Chastain"
"Anne Hathaway"
];
director = "Christopher Nolan";
year = 2014;
}
{
title = "Kill Bill: Volume 1";
director = "Quentin Tarantino";
starring = [
{ actor = "Uma Thurman"; character = "The Bride"; }
{ actor = "Lucy Liu"; character = "O-Ren Ishii"; }
{ actor = "David Carradine"; character = "Bill"; }
];
year = 2003;
}
{
title = "The Witch";
director = "Robert Eggers";
starring = [ "Anya Taylor-Joy" "Ralph Ineson" ];
year = 2015;
}
];
friends = [
{
name = "Floyd";
age = 29;
married = true;
favorite-movies = [
{
title = "The Departed";
starring = [ "Leonardo DiCaprio" "Vera Farmiga" "Matt Damon" ];
director = "Martin Scorsese";
year = 2006;
}
{
title = "Shutter Island";
starring = [ "Leonardo DiCaprio" "Mark Ruffalo" ];
director = "Martin Scorsese";
year = 2010;
}
];
friends = [];
}
];
}
Why?
As someone who has often needed to both write data serialization formats by hand and work with them programmatically, I wanted a better way. I tried many formats: JSON, YAML, TOML, CSV, XML, KDL, Lua tables, Java properties, and others. You name it, I tried it. Many of them had enough nagging issues to motivate me to look for a better format, but one never turned up.
"But JSON works fine"
Have you ever been in the position of writing JSON by hand, rather than just having a library parse it? If you haven't, then yes, that's a logical conclusion. Personally, I often need to write data manually, and many of the popular formats add more friction to that experience than they should. For those who need a data serialization format but never (or rarely) deal directly with the data in its storage format, this may seem like nit-picking; it feels different once you find yourself writing these formats by hand.
Background
If you feel that God syntax is familiar, that's probably because it is. God
isn't a new syntax; it is derived directly from the Nix programming language.
Any valid God code can be validated directly by Nix, with nix eval -f file.god. I saw no need
to create a new language when I realized Nix had exactly the bones needed to derive a flexible
(and easy to understand) data serialization format. God is a subset of Nix which omits its
programming syntax and features in favor of static data representation.
Some of the benefits include:
- It can be validated by nix
- Conversion from GOD to JSON with nix eval -f file.god --json
- A number of existing tools for working with Nix code can be used
If you would like to see some sample document files, see the examples page.
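As a rough illustration of the JSON conversion listed above, the Python sketch below shells out to the nix command quoted in this section and parses its output. The function names nix_eval_command and load_god are hypothetical, and the sketch assumes a nix binary on PATH that accepts exactly the invocation shown above. Because nix evaluates the full Nix language, this only confirms that a file is valid Nix; it does not enforce the God subset.

```python
import json
import subprocess


def nix_eval_command(path: str) -> list[str]:
    """Build the command quoted in this document: nix eval -f <path> --json."""
    return ["nix", "eval", "-f", path, "--json"]


def load_god(path: str):
    """Evaluate a God file with nix and return the decoded JSON value.

    Assumption: nix is installed and prints the JSON document on stdout.
    Raises subprocess.CalledProcessError if nix rejects the file.
    """
    result = subprocess.run(
        nix_eval_command(path),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

Calling load_god("example/simple.god") would then return an ordinary Python dictionary, provided nix is available.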
Examples
These can all be found in the example/ directory of the project's Git repository.
simple.god
{
name = "Will";
age = 26;
numbers = [ 9 -45 3.14 ];
special = {
yes = true;
no = false;
none = null;
};
long-string = ''
Hello
there!
'';
}
package.god
{
name = "shepherd";
version = "1.0.5";
licensing = [ "GPL-3.0-or-later" ];
links = {
home = "https://gnu.org/software/shepherd";
repo = "https://codeberg.org/shepherd/shepherd.git";
};
tag = {
release = true;
name = "v1.0.5";
};
foreign = [
"usr/share/doc/shepherd-1.0.5"
"usr/share/guile/site/3.0/shepherd"
"usr/lib/guile/3.0/site-ccache/shepherd"
"usr/libexec/shepherd"
];
}
types.god
{
name = "Will";
nums = [ 1 2 3 true false null "string" ];
mapping = { age = 26; };
yes = true;
no = false;
nothing = null;
things = {
one = true;
zero = false;
nada = null;
list = [ true false null "string" 1 2 3 { map = "self"; catch = 22; lie = true; } ];
};
list-of-maps = [
{
string-with-escapes = "\"\\there should be a single slash at the beginning when interpreted and this would be entirely quoted and\r\n\tindented on a new line here as well.\"";
list-within-map-within-list = [ 1 2 3 true false null "\"escaped quotes\"" ];
}
{
more = "less";
}
];
}
directions.god
{
directions = [
{
name = "north";
cardinal = true;
}
{
name = "east";
cardinal = true;
}
{
name = "west";
cardinal = true;
}
{
name = "south";
cardinal = true;
}
{
name = "down";
cardinal = false;
}
];
}
deep.god
{
user = {
name = "Will";
age = 26;
married = false;
friends = [
{
name = "Floyd";
age = 29;
married = true;
favorite-numbers = [ 1 2 -3.14 false null true "Hello!" 69 ];
qualities = {
emotional = [ "patient" 1 "nice" null ];
};
}
];
};
}
complex.god
{
name = "Will";
age = 26;
married = false;
favorite-movies = [
{
title = "Interstellar";
director = "Christopher Nolan";
}
{
title = "Kill Bill Volume 1";
director = "Quentin Tarantino";
}
];
friends = [
{
name = "Floyd";
age = 29;
married = false;
favorite-movies = [
{
title = "Training Day";
director = null;
}
{
title = "The Departed";
director = "Martin Scorsese";
}
];
friends = [];
}
];
}
string-escapes.god
{
string = "normal string";
special-strings = [
"\"\\entirely quoted with a single slash at the start and\r\n\tnewline + indent here.\""
"\" \\ this should quoted with slashes on both sides \\ \""
"\\tabs\t\\and\t\\slashes\t\\with\t\\every\t\\word."
"\nline-feeds above and below\n"
"\r\ncarriage-return/line-feeds above and below\r\n"
"\rcarriage-returns on both sides\r"
];
}
Implementations
- Guile Scheme: wreedb/guile-god
- Tree-Sitter Grammar: wreedb/tree-sitter-god
If you have decided to implement the language, please contact me!
Values
The value types in GOD are intentionally rudimentary, with the goal of being useful to almost any programming language. They are flexible and have few restrictions.
Strings
Regular
A standard or regular string is represented by a pair of double quotes with any amount of text inside it.
greeting = "Hello, how are you?";
Multi-line
Strings that span multiple lines are supported. They are defined using two pairs of single quotes (''): one at the beginning and one at the end.
about-me = ''
Let me tell you
about myself!
'';
Multi-line strings are also indentation-aware: the indentation of the contained string is calculated relative to the leftmost column that contains meaningful (non-whitespace) text.
my-string = ''
There are four spaces before this,
but they will not be preserved.
'';
# produces:
# "There are four spaces before this,\n but they will not be preserved.\n"
Escaping
You can escape (double) quotes in a regular string by placing a backslash (\) before them. The same
applies to line-feeds, carriage returns and tab characters (\n, \r, \t).
To escape any character in a multi-line string, prefix it with ''\
height = "6'2\"\n";
# produces:
# 6'2"\n
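A decoder for the regular-string escapes described above might look like the following Python sketch. The name unescape_regular is hypothetical, and the input is the text between the double quotes. How an unlisted escape (a backslash before any other character) should be treated is not stated here, so the sketch assumes the backslash is dropped and the character kept.

```python
# Escapes named in this document for regular strings.
ESCAPES = {"n": "\n", "r": "\r", "t": "\t", '"': '"', "\\": "\\"}


def unescape_regular(body: str) -> str:
    """Decode the escapes of a regular (double-quoted) string body."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            # Assumption: an unknown escape yields the character itself.
            out.append(ESCAPES.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# The body of the height example: 6'2\"\n (raw, escapes still encoded).
raw = r"""6'2\"\n"""
decoded = unescape_regular(raw)
```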
greeting = ''
I said ''\'Hello!''\'
to them.
'';
# produces:
# "I said 'Hello!'\n to them.\n"
The whitespace and newline on the opening line after '' are ignored if there
is no meaningful (non-whitespace) text or characters on said initial line.
Also, leading tab (\t) characters are not stripped from the beginning of the
line, so it is best practice to use spaces within multiline strings unless
this is desired.
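The indentation rules for multi-line strings described above can be sketched in Python as follows. This is a minimal sketch, not a reference implementation: the function name dedent_multiline is hypothetical, it takes the raw text between the opening and closing '' delimiters, and it assumes that whitespace-only lines collapse to empty lines, which this document does not specify.

```python
def dedent_multiline(raw: str) -> str:
    """Strip the common leading-space margin from a multi-line string body."""
    lines = raw.split("\n")

    # The opening line is ignored when it holds no meaningful text.
    if len(lines) > 1 and lines[0].strip(" \t\r") == "":
        lines = lines[1:]

    def leading_spaces(line: str) -> int:
        # Only spaces count as indentation; tabs are treated as content.
        return len(line) - len(line.lstrip(" "))

    meaningful = [line for line in lines if line.strip() != ""]
    margin = min((leading_spaces(line) for line in meaningful), default=0)

    out = []
    for line in lines:
        if line.strip() == "":
            out.append("")  # assumption: blank lines carry no indentation
        else:
            out.append(line[margin:])
    return "\n".join(out)


body = (
    "\n"
    "    There are four spaces before this,\n"
    "    but they will not be preserved.\n"
    "  "
)
result = dedent_multiline(body)
```

Because tabs are not counted as indentation, a line that starts with a tab sets the margin to zero and every line keeps its leading whitespace, which matches the advice above to prefer spaces.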
Numbers
Numbers in God are neither specifically integers, doubles nor floats.
They can be any of them. In Nix, integers have an upper and lower boundary
of 9223372036854775807 and -9223372036854775807 respectively, as they are
64-bit two's complement signed integers. God mimics this behavior, albeit
in a slightly simpler form, since all data here is completely static.
age = 26;
age-negative = -26;
pi = 3.14;
pi-negative = -3.14;
exp = 0.27e13;
In practice, a number is a sequence of one or more numeric digits. It may be prefixed with a negation operator, and may use lower- or upper-case exponent notation.
decimal-exp = -.5e10;
decimal-negative-exp = -.123E-5;
Not all programming languages can represent these limits effectively; therefore the implementer should document any deviations from these limits clearly for their users.
If the technical details needed for proper usage are not documented by the implementation, the implementation MAY NOT claim to be compliant with this specification.
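To make the limits above concrete, here is a hedged Python sketch of how an implementation might read a number token and enforce the integer boundaries. NUMBER_PATTERN is only an approximation inferred from the examples in this section, not the official grammar, and parse_number is a hypothetical name.

```python
import re

# Integer boundaries as stated in this document.
INT_MAX = 9223372036854775807
INT_MIN = -9223372036854775807

# Assumption: optional negation, digits with an optional fraction (or a
# leading-dot fraction), and an optional e/E exponent.
NUMBER_PATTERN = re.compile(r"-?(?:\d+(?:\.\d+)?|\.\d+)(?:[eE][+-]?\d+)?")


def parse_number(token: str):
    """Return an int or float for a number token, or raise ValueError."""
    if NUMBER_PATTERN.fullmatch(token) is None:
        raise ValueError(f"not a number: {token!r}")
    if any(c in token for c in ".eE"):
        return float(token)
    value = int(token)
    if not INT_MIN <= value <= INT_MAX:
        raise ValueError(f"integer out of range: {token}")
    return value
```

Python integers are unbounded, so the range check has to be explicit; a language with fixed-width integers would instead need to document where its own limits differ.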
Booleans
Boolean values in God are written as the unquoted text true and false.
As in almost any context related to computer science, booleans are a data type
used to describe something that has one of two possible values; most commonly
true/false and 1/0.
yes = true;
no = false;
Some languages may have unconventional boolean data types, and therefore the implementer may want to use the closest analogue in their language, for example, in Emacs Lisp:
(setq foo t)
(setq bar nil)
There is no false boolean type in the language. There is t to represent a
truthy value, and the nil keyword is often used in place of a falsy
value.
Due to the permissiveness of identifiers in God (as well as in Nix), it
is completely valid to use true, false and even null as identifiers.
true = false;
false = null;
Using identifiers that share a name with built-in language types is highly discouraged for obvious reasons; they are documented here only for completeness.
Null
Written as the unquoted text null, this value represents nothing.
This may correlate to a language's null value, or represent the absence of a
value in languages which do not have a null type. In some languages this may
be best represented by 0 or an empty list ('()), as in some Lisp-style languages.
nothing = null;
It is at the discretion of the implementer to decide what makes the most sense for their use case and the capabilities of their language.
As described under Booleans, null can be used as an identifier,
though this is highly discouraged.
Maps
A map in God is a data structure that has an analogue in practically every programming language: Lua tables, Python dictionaries, Perl hashes, JavaScript/TypeScript objects, Lisp and Scheme association lists, and Java HashMaps, just to name a few.
The commonality is the structure of an identifier which is assigned a group of fields. Fields can be seen as key-value pairs in an abstract sense.
In God specifically, they are delimited by opening and closing "curly"
braces ({ }). When a map is the value of a field, that field must end with a
termination operator; when a map is used as a list element, only the fields
within it require termination.
me = {
name = "Will";
age = 26;
married = false;
favorite-songs = [
{ artist = "Slint"; title = "Nosferatu Man"; }
{ artist = "OutKast"; title = "Hey Ya!"; }
];
best-friend = {
name = "Floyd";
age = 29;
married = true;
favorite-songs = [
{ artist = "Tool"; title = "Lateralus"; }
{ artist = "Deafheaven"; title = "Dream House"; }
];
};
};
Any type of field is allowed within a map, so long as it follows any necessary rules of field termination.
Some languages allow identifiers to be used more than once within a map or similar data structure, with the last occurrence determining its value. This is not valid in God.
me = {
name = "Will";
age = 26;
age = 25; # this is an ERROR
};
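One way an implementation could enforce the duplicate-identifier rule is sketched below in Python. It assumes a parser that yields the fields of a map as (identifier, value) pairs in source order; build_map is a hypothetical helper, not part of any existing implementation.

```python
def build_map(fields):
    """Collect (identifier, value) pairs into a dict, rejecting duplicates."""
    result = {}
    for identifier, value in fields:
        if identifier in result:
            # Unlike "last one wins" formats, a repeated identifier is an error.
            raise ValueError(f"duplicate identifier: {identifier!r}")
        result[identifier] = value
    return result


me = build_map([("name", "Will"), ("age", 26)])
```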
Lists
These are a grouping of any number of elements. They may be the
element assigned to a field, or nested within another list as one of its
elements. They are delimited by opening and closing "square" brackets ([ ]).
The elements contained are separated by whitespace.
They have no strict type requirements for their elements, meaning they can hold numbers, strings, maps and other lists.
favorite-movies = [
"Interstellar"
"The Witch"
"Kill Bill Vol. 1"
];
list-of-lists = [
[ 1 2 3 ]
[ true false null ]
[ "foo" "bar" ]
[
{ name = "Will"; age = 26; }
{ name = "Floyd"; age = 29; }
]
];
Structure
Document
The document is used to describe the starting and ending boundaries of data
in a God file. It is delimited by opening and closing "curly" braces
({ }). There can only be one set of document-level delimiters in a God file.
{ # document begins here
name = "Will";
hobbies = [ "Programming" "Watching Movies" "Playing Video Games" ];
} # document ends here
Within a document, any number of fields are allowed in any order at any level of depth.
The document is not a field, therefore it does not use a field termination operator.
As stated above, only one document-level set of delimiters is allowed, meaning the following example is invalid:
{
name = "Will";
}
{
age = 26;
}
Operators
The following symbols are operators in God:
Assignment
This is the operator that denotes an identifier being assigned a value. It is written as an equal sign.
thing = "value";
Termination
These are the integral final part of a field, distinguishing where a given field ends. They are required to terminate all fields. They are written as a semi-colon.
numbers = { one = 1; two = 2; three = 3; };
Negation
These are used as a prefix to number values; denoting the negativity of the number in question.
negative-numbers = [ -1 -2 -3 -3.14 ];
positive-numbers = [ 1 2 3 3.14 ];
Elements
An element is a raw value, either as the assigned value of a field or as an item in a list. Elements can be any of the fundamental data types.
things = [
"string"
1
null
false
''
Multi-line
string
''
[ "another" "list" ]
{ text = "another map"; }
];
In this example, the following parts are elements:
- "string"
- 1
- null
- false
- '' ... Multi-line string ... '' (abbreviated)
- [ "another" "list" ]
  - "another"
  - "list"
- { text = "another map"; }
  - "another map"
Identifiers
These are unquoted text denoting the name or identity of a field.
Identifiers may not contain any of the following standard symbols:
- .: period
- %: percent sign
- $: dollar sign, as well as other localized currency symbols
- @: at sign
- !: exclamation point
- ^: caret
- &: ampersand
- *: asterisk
- ": double-quote
- `: backtick/grave
- ~: tilde
- []: opening and closing square brackets
- {}: opening and closing curly braces
- (): opening and closing parentheses
- ,: comma
- +: plus/addition sign
- =: equal sign
- ><: angle brackets (greater/lesser than signs)
- ?: question mark
- \/: back and forward slashes
- ;: semi-colon
Identifiers may not begin with:
- -: hyphen
- ': single-quote
- 0-9: numeric digits
They may, however, contain and be suffixed by:
- -: hyphen
- ': unpaired single-quote
- _: underscore
Outside of these limits, any number of digits (0-9) and letters (A-Z, a-z) are allowed.
# containing hyphens/underscores
abc-123 = "fa so la ti do";
abc_123 = null;
# suffixed by hyphens/underscores
abc-123- = "fa so la ti do";
abc_123_ = null;
# highly impractical, but valid
a'b'c'1'2'3 = "do re mi";
a_-_b-'_'-c'1_2-'3' = { crazy = true; };
The use of non-ASCII Unicode characters (emojis, non-Latin characters, accented characters, etc.) in identifiers is invalid. While support for non-English languages would be desirable, implementation difficulty, security concerns, and lack of expertise with non-Latin scripts make this a strict limitation for the foreseeable future.
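The identifier rules above can be condensed into a single pattern, shown here as a Python sketch. The pattern and the name is_valid_identifier are illustrative, not taken from an existing implementation. It assumes an identifier may start with a letter or an underscore, since this document only lists what an identifier may not begin with.

```python
import re

# First character: ASCII letter or underscore (assumption, see above).
# Remaining characters: ASCII letters, digits, underscore, single-quote, hyphen.
IDENTIFIER_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_'-]*")


def is_valid_identifier(text: str) -> bool:
    """Return True if text satisfies the identifier rules sketched here."""
    return IDENTIFIER_PATTERN.fullmatch(text) is not None


examples = ["abc-123-", "a'b'c'1'2'3", "-abc", "9lives", "a.b"]
verdicts = {name: is_valid_identifier(name) for name in examples}
```

Using explicit ASCII character classes rather than \w keeps non-ASCII letters out, in line with the restriction above.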
Fields
These are the most common and fundamental pieces of data in the language.
They are a combination of four parts:
- identifier: the name of the field
- assignment operator: to assign the name to a value
- element: the field's assigned value
- termination: to end the field declaration
name = "Will";
favorite-things = [ "Movies" "Programming" ];
favorite-movie = {
title = "Interstellar";
director = "Christopher Nolan";
release-year = 2014;
leading-roles = [
{
character = "Joseph Cooper";
actor = "Matthew McConaughey";
}
{
character = "Dr. Amelia Brand";
actor = "Anne Hathaway";
}
];
};
In this example, there are 11 fields in total. Let's break down the biggest one:
favorite-movie = {
title = "Interstellar";
director = "Christopher Nolan";
release-year = 2014;
leading-roles = [
{
character = "Joseph Cooper";
actor = "Matthew McConaughey";
}
{
character = "Dr. Amelia Brand";
actor = "Anne Hathaway";
}
];
};
Here, there are nine total fields including the favorite-movie map.
Within it, there are eight fields:
| identifier | assigned value |
|---|---|
| title | string "Interstellar" |
| director | string "Christopher Nolan" |
| release-year | number 2014 |
| leading-roles | a list containing two maps as elements |
The first element of leading-roles:
| identifier | assigned value |
|---|---|
| character | string "Joseph Cooper" |
| actor | string "Matthew McConaughey" |
The second element of leading-roles:
| identifier | assigned value |
|---|---|
| character | string "Dr. Amelia Brand" |
| actor | string "Anne Hathaway" |
It is an important distinction that the two elements within the leading-roles list are NOT fields; they are elements that contain fields.
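The field counts in this breakdown can be checked mechanically. The Python sketch below models the example as plain dictionaries and lists (as a parser might produce) and counts only map entries as fields, so list items contribute nothing by themselves. The function name count_fields is hypothetical.

```python
def count_fields(value) -> int:
    """Count fields at every depth; list items are elements, not fields."""
    if isinstance(value, dict):
        # Each key is one field, plus whatever fields its value contains.
        return sum(1 + count_fields(v) for v in value.values())
    if isinstance(value, list):
        return sum(count_fields(item) for item in value)
    return 0


document = {
    "name": "Will",
    "favorite-things": ["Movies", "Programming"],
    "favorite-movie": {
        "title": "Interstellar",
        "director": "Christopher Nolan",
        "release-year": 2014,
        "leading-roles": [
            {"character": "Joseph Cooper", "actor": "Matthew McConaughey"},
            {"character": "Dr. Amelia Brand", "actor": "Anne Hathaway"},
        ],
    },
}

total = count_fields(document)                      # all fields in the example
inside = count_fields(document["favorite-movie"])   # fields within the map
```

Here total is 11 and inside is 8; adding the favorite-movie field itself gives the nine mentioned above.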
Whitespace
The following are considered whitespace in God:
| name | value |
|---|---|
| space characters | \x20 |
| tab characters | \t |
| line-feed (LF) | \n |
| carriage-return (CR) | \r |
| comments | Varied |
Comments
Comments in God only come in one variety, line comments:
# this is a line comment, occupying an entire line.
name = "Will"; # this is another line comment, occupying only the end of the line
Despite Nix supporting multi-line and mid-statement comments, God does not. In practice, any meaning that can be conveyed through comments can be conveyed well enough by line comments (whether they occupy a full line or only the end of one), making the other types unnecessary.
Terminology explanation
Terminology regarding comments is a common source of confusion; so in an
effort to make it perfectly clear, here are the different types of comments,
using C as the language to demonstrate.
Multi-line "Block" comments:
int main() {
/* this is foo
* it equals two
* */
int foo = 2;
return 0;
}
Line comments:
int main() {
// this is foo
int foo = 2; // it equals two
return 0;
}
Mid-statement comments:
int main() {
int /* this is foo, it equals two */ foo = 2;
return 0;
}
As stated above, God only supports line comments; the other types are invalid.