[Systers-dev] Database Schema for Message Storing

Gloria W strangest at comcast.net
Tue Jul 27 13:26:55 PDT 2010


  Jenn, thank you. This response makes sense.

Gloria
>
>
> On Fri, Jul 23, 2010 at 1:23 AM, Gloria W <strangest at comcast.net> wrote:
>
>     I strongly disagree with this statement, and know many who use it
>     in production for high throughput, have not lost data, and are
>     happy with the results. I am also going to ask you for your data
>     substantiating this statement, since I cannot find any. If you're
>     going to point to this one article, this will not suffice.
>
>
> 1) I'm coming off of several conferences - Open Source Bridge and 
> OSCON - where I spent a fair amount of time talking to DB developers 
> et al.  The consensus is that MongoDB is not ready for 
> wide-scale adoption in the majority of production environments.
>
> 2) MongoDB requires 2 servers and replication for durability: 
> http://blog.mongodb.org/post/381927266/what-about-durability
>
> 3) There is a 2.5 GB data limit on 32-bit systems - 
> http://blog.mongodb.org/post/137788967/32-bit-limitations
>
> While 2 and 3 may be acceptable for high-end users with the resources 
> to play with new (and yes, innovative) technologies, the average person 
> or organization running a Mailman listserv does not have those resources.
>
> All of the Systers servers are currently 32-bit systems.
>
>
>         If you want to switch to a non-relational db server then you
>         need to do some analysis first on why that is a correct
>         decision and then evaluate all the tools.  Stability and
>         extensibility are extremely important.  How many people
>         running Mailman are going to be willing to run MongoDB?  Or
>         CouchDB for that matter?
>
>     I think more people will be willing than you realize, given the
>     throughput benefits. Like I said before, db architects have been
>     making the ACID vs. non-ACID compliance decision for many years,
>     and this is not a new concept.
>
>     Also, I am more fearful of projects which use relational, heavily
>     indexed, slow-insert databases on potentially high traffic
>     projects, without doing the analysis first on where this model
>     will fail. Your view of having to do analysis to prove the need
>     for a non-relational db strikes me as odd. We need to do analysis
>     to prove that a relational db is the correct solution to this
>     problem, and will not so badly affect performance that it brings
>     Mailman to its knees.
>
>
> Generally speaking, it is best practice to use what is widely available 
> and stable in production environments.  If you want to make an 
> argument for a change from the status quo you need to provide:
>
> 1) Performance measures that indicate there is indeed a problem
> 2) An analysis, covering both the good and the bad, of why the change 
> from the stable technology is necessary and why the new software is better.
>
> This is especially necessary given some of the volatile development 
> processes in OS development.  This is also why Red Hat has RHEL - rapid 
> adoption of innovative technologies is often very bad for the 
> stability of a production environment.
>
> In the case of MM development and specifically the archives project I 
> would expect to see the following questions addressed:
>
> 1) How does MongoDB fit in with upstream development - particularly 
> since the focus is on SQL-based DBs?
> 2) How does MongoDB improve the performance of X functionality?
> 3) Is the durability/performance trade-off worth it in terms of system 
> function and end-user experience?
> 4) Is the need for wide-spread adoption of 64-bit systems worth the 
> performance bump? Remember this involves lots of time and upgrades 
> across a varied user-base.
> 5) Is the investment in resources to implement replication worth the 
> performance bump?
> 6) Is the investment in training all the 1000s of MM listserver admins 
> worth the performance bump?
>
> Right now we are using PostgreSQL.  We have not experienced any 
> performance issues.  PostgreSQL is a mature project with a vibrant 
> development community and has proven to be stable over 15 years of 
> history, as opposed to ~2 for MongoDB.
>
> Additionally, because MongoDB is a fairly new piece of tech, frequent 
> updates and upgrades for bug-fixes are to be expected, which means 
> more patching in the production environment - always a bad thing.
>
> So in my opinion -- MongoDB is not ready for wide-scale adoption in 
> production environments and is likely to result in stability and 
> scalability problems for the average Mailman Listserv operator and 
> probably most sysadmins running a basic LAM/PP setup.
>
> I don't want to dissuade anyone from playing with and exploring NoSQL 
> options and thinking about alternatives to RDBMS -- I'm doing so 
> myself.  But MongoDB is not the right solution for this project at 
> this time.  And you need to think differently about production 
> environments - stable and scalable are key.
>
> Jen
