At CurrySoftware we use recfiles combined with git for all business processes (incoming and outgoing invoices, customers, etc.).
It allows us to automate everything we want with simple bash scripts, while staying flexible: anything that isn't automated yet can still be done by hand.
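For a flavour of what that looks like (a hedged sketch with made-up file and field names, not our actual scripts):

    # list all unpaid invoices
    recsel -e "Status = 'unpaid'" invoices.rec

    # append a new invoice record and put it under version control
    recins -f Id -v 2023-042 -f Amount -v 120.00 -f Status -v unpaid invoices.rec
    git add invoices.rec && git commit -m "Add invoice 2023-042"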
Your processes made me think about an email-based interface instead of a bash script; that might allow people to interact with the database bot easily without knowing bash or Python.
Harder, since they don't have an open API and don't want people using non-standard clients. The clients are open source though, so you can probably do it.
Apparently you can get output in CSV (natively) and JSON (via a Python one-liner) without too much hassle (http://swick.2flub.org/recutils_JSON_output.html), which suddenly makes it even more interesting.
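The CSV part is a stock tool: recutils ships rec2csv. The JSON step below is just one way to do the one-liner (standard-library Python piped after rec2csv; the linked post may do it differently):

    # recfile -> CSV with the bundled converter
    rec2csv books.rec > books.csv

    # recfile -> JSON via CSV, standard library only
    rec2csv books.rec | python3 -c 'import csv,json,sys; print(json.dumps(list(csv.DictReader(sys.stdin)), indent=2))'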
I have done some prototyping on a similar idea, but I think with a more idiomatic approach. The idea is mostly adding relational structure (a schema) to CSV, and enabling a cleaner lexical syntax (getting rid of the line noise).
Might dust it off someday and try to bring it to a more serious level (performance, tooling, etc.).
YAML is an example of a hierarchical data storage format which is much more readable than XML. The problem with YAML is that it was designed as a “data serialization language” and thus to map the data constructs usually found in programming languages. That makes it too complex for the simple task of storing plain lists of items.
I don't see how this is true. The provided sample with books is almost identical in YAML.
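For instance (my own reconstruction, not the article's exact sample), a book record in rec format:

    Title: GNU Emacs Manual
    Author: Richard M. Stallman
    Publisher: FSF

and the same thing in YAML:

    - Title: GNU Emacs Manual
      Author: Richard M. Stallman
      Publisher: FSF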
The main benefit over YAML looks like more control over individual fields, but again, a YAML-based db app could do that too.
This reminds me a bit of using CGI.pm's "Save" function. I built a pretty decent invoicing app using that, and searches for data in documents saved in that format are pretty fast.
I won't pretend to know the ins and outs of that, but I was told on a Perl mailing list that the server created a "B-Tree" index when an initial search was made and used it afterwards.
I guess these are quite slow (because there's no indexing) once you have a serious number of records? That in itself isn't a problem as long as you understand the scope of the project. I wonder why they didn't use (a well-defined subset of) CSV as the format, however.
And I don't think the performance issue exists.
Computers are fast nowadays, and parsing recfiles is straightforward (see the sketch below).
Also, you could easily archive historic/old/probably-irrelevant records.
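To back up "straightforward": here is a minimal sketch of a recfile reader in Python. It deliberately ignores most of the real grammar (%rec descriptors, typed fields, escaping), so treat it as an illustration, not a recutils-compatible parser.

    def parse_recfile(path):
        """Parse a recfile into a list of {field: [values]} dicts.

        Minimal sketch: records are separated by blank lines, fields
        look like "Name: value", "+ value" continues the previous
        field, and "#" starts a comment.
        """
        records, current, last = [], {}, None
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.rstrip("\n")
                if not line.strip():              # blank line: end of record
                    if current:
                        records.append(current)
                    current, last = {}, None
                elif line.startswith("#"):        # comment
                    continue
                elif line.startswith("+") and last:
                    # continuation line: extend the previous value
                    current[last][-1] += "\n" + line[1:].lstrip()
                else:
                    name, _, value = line.partition(":")
                    current.setdefault(name, []).append(value.lstrip())
                    last = name
        if current:
            records.append(current)
        return records

Fields map to lists because a field may legitimately repeat within a record in the rec format.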
This is why I was very careful to say "well-defined subset". I wrote a full CSV library[1], so I'm well aware of how deceptively difficult CSV is to deal with. However, with a well-defined subset (and perhaps not using "," as the separator), it should be editable for at least simple changes.
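As one concrete illustration of such a subset (my own example, nothing to do with the library at [1]): tab-separated, quoting disabled outright, no embedded newlines, fixed header row. Reading it then becomes trivial:

    import csv

    # Restricted dialect: tab as separator, no quoting at all, so a
    # value can never contain a tab or a newline.  "books.tsv" and its
    # columns are made up for the example.
    with open("books.tsv", newline="") as f:
        reader = csv.DictReader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            print(row["Title"], "-", row["Author"])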
CSV locks you into the same fields per data entry, which makes it a little less flexible. Plus, one of the appeals for me is the readability of the raw data; CSV gets long and thin very quickly. Granted, each will best suit a certain type of data.
But any indexing system you create won't work with the rec* tools; `recsel`, for example, will not get any faster on large files.
Not sure if they have indexing on the roadmap, but it does make sense to me for people who have adopted it and are starting to get bigger databases. (You can bolt an index on from the outside, as sketched below, but the stock tools won't use it.)
Of course, you could argue that when the files get too big, it's time to switch to a different solution.
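To illustrate the "won't work with the rec* tools" point, a hedged sketch of such an outside index: one linear scan records the byte offset of every record, keyed by a unique field ("Id" is my assumption, not something recutils mandates), and later lookups seek straight to the record. recsel knows nothing about this side file and keeps scanning linearly regardless.

    def build_offset_index(path, key_field="Id"):
        """One linear pass: map each record's key field to its byte offset.

        key_field is a hypothetical unique field; use whatever your
        records actually have.
        """
        index, offset, rec_start, key = {}, 0, 0, None
        with open(path, "rb") as f:
            for raw in f:
                line = raw.decode("utf-8").rstrip("\n")
                if not line.strip():              # blank line: record boundary
                    if key is not None:
                        index[key] = rec_start
                    rec_start, key = offset + len(raw), None
                elif key is None and line.startswith(key_field + ":"):
                    key = line.split(":", 1)[1].strip()
                offset += len(raw)
        if key is not None:                       # last record, no trailing blank
            index[key] = rec_start
        return index

    def fetch_record(path, index, key):
        """Seek straight to one record instead of scanning the whole file."""
        with open(path, "rb") as f:
            f.seek(index[key])
            chunk = []
            for raw in f:
                if not raw.strip():
                    break
                chunk.append(raw.decode("utf-8"))
            return "".join(chunk)

The index dict itself could be cached on disk and rebuilt whenever the recfile changes, but none of that makes recsel itself any faster.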
It seems that's kind of a natural tension in projects. Do you grow the scope to accommodate existing users with growing use cases? Or, do you draw the line in the sand and have people move on to a different solution?
This reminds me that kame.net has a turtle as its logo, and as an incentive to upgrade, the turtle is animated when the site is accessed over IPv6. So just be grateful the recutils developers didn't have IPv6 when they were looking for inspiration.