I’m still playing around with Mnesia and CouchDB as discussed in the previous posts, but one of the purposes of the test project I’m doing at the moment is to be able to compare various approaches. There doesn’t seem much point playing with new and exciting things without making a comparison with more tried and tested ways, which is my justification for importing this data a third time.
So, on top of my Debian + Erlang + Ejabberd + Jabberlang + YAWS + ErlyWeb + CouchDB setup arrived at earlier, I’ve installed MySQL:
sudo apt-get install mysql-server mysqladmin -u root password sensiblepwd
I wrote a quick script to set up the database to hold some core data, with the plan being to migrate the rest away from the client once this part is up and running. I decided to use Python to import the data, which even with my limited Python skills should have extremely simple. I was over the moon when my hacked together import code seemed to be working first time, but then it hit some UTF8 and all hell broke loose.
The problem was the MySQL driver for Python, which didn’t handle UTF-8 (yes, tripped up by UTF-8 handling 3 out of 3 times now) in any of the packaged versions. Figuring this out and getting the right dependencies in place to build the latest release involved a painful mixture of Googling and trial and error, but the end result was this:
sudo apt-get install libmysqlclient15-dev python-dev wget http://downloads.sourceforge.net/mysql-python/MySQL-python-1.2.2.tar.gz cd MySQL-python-1.2.2 python setup.py build sudo python setup.py install
Finally my importer came up with the result I was expecting (Finished – 6714 records, 84 tags, 9926 links) and I could repeat the ErlyWeb setup I did before, but this time using the MySQL driver and the latest (revision 174) ErlyWeb from SVN.
One insight that immediately came to mind is that while it was blatantly obvious to me how to structure the data for MySQL, it is less so for Mnesia, and for CouchDB I am still not entirely sure of the best approach yet. For now, here are a couple of the more obvious points:
- A ‘record’ has a ‘numplays’ field which obviously indicates how many times it has been played. This is fine in MySQL, but if I just incorporate this field into a document for CouchDB I will get a complete duplicate revision of the document in the database every time this one number changes, which seems awfully inefficient.
- The three-table layout in MySQL of records, tags, and a link table between them (see the script if that isn’t clear) is (to me at least) obviously the right solution, but there are many possible ways of doing this in both Mnesia and CouchDB and I find I don’t intuitively have the answers.
More on the above as time goes on obviously, along with some practical results and measurements, and probably more tales of UTF-8 handling woes.
-
hey,
your at the same point as me i think.
How to best use CouchDB. I too am having trouble doing exactly the same thing.

1 comment
Comments feed for this article
Trackback link: http://ciarang.com/posts/couchdb-vs-mnesia-vs-mysql/trackback