[Systers-dev] Database Schema for Message Storing
Gloria W
strangest at comcast.net
Fri Jul 23 09:31:34 PDT 2010
No problem, I respect your opinion. Irrespective of this, please do
pass me whatever evidence led you to believe that MongoDB is unfit for
production. I have never heard of such a thing, and I would really
dislike seeing such an unsubstantiated rumor spread through a large
development community. The source of this information is also
extremely important to me.
Thank you,
Gloria
> I definitely don't want to get into database wars, but some things to
> think about for throughput decisions
> - for the standard use of mailman, I don't think we need a high
> throughput database. Systers has 3000 members, which I suspect is on
> the high side for mailman lists, and even on days when we are going
> fast and furious, mailman can keep up (and I don't think we are
> particularly well tuned for throughput). In fact, I consider slowness
> a feature, as it keeps people from posting even more often (they don't
> see the responses as quickly), and it's a rare topic that increased
> posting makes better :-). Maybe Terri can comment on whether there
> are situations where mailman lists need high/higher throughput
> - now, for archives, a different approach may be important -- you
> want to pull out the messages that match the request and show them (or
> at least show the first 10, and I don't know if they come out of the
> database in the order we will want to show them). That does need to
> be "fast enough", which is probably "pretty fast".
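
A minimal sketch of the archive lookup described above, assuming a
relational store. It uses SQLite from the Python standard library only
so the example runs anywhere; the table name, column names, and index
are hypothetical and are not taken from the schema on the wiki. Rows
come out in the display order only because the query asks for it with
ORDER BY, and the composite index is what keeps the first page cheap.

```python
import sqlite3


def open_archive(path=":memory:"):
    """Create a hypothetical archive table plus the index the query relies on."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS messages (
               id        INTEGER PRIMARY KEY,
               list_name TEXT NOT NULL,
               subject   TEXT NOT NULL,
               sent_at   TEXT NOT NULL  -- ISO 8601 text sorts chronologically
           )"""
    )
    # Composite index: filter by list, then read in date order without a sort.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_messages_list_sent "
        "ON messages (list_name, sent_at)"
    )
    return conn


def first_page(conn, list_name, page_size=10):
    """Return the newest page_size messages for one list, newest first."""
    rows = conn.execute(
        "SELECT subject, sent_at FROM messages "
        "WHERE list_name = ? "
        "ORDER BY sent_at DESC, id DESC "
        "LIMIT ?",
        (list_name, page_size),
    )
    return rows.fetchall()


conn = open_archive()
conn.executemany(
    "INSERT INTO messages (list_name, subject, sent_at) VALUES (?, ?, ?)",
    [
        ("systers-dev", f"message {i}", f"2010-07-{i:02d}T09:00:00")
        for i in range(1, 26)
    ],
)
page = first_page(conn, "systers-dev")
```

Without the explicit ORDER BY the database is free to return rows in
any order, which is the uncertainty raised in the bullet above.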
>
> We definitely need an abstraction layer that allows us to (at some
> point -- maybe not this summer) experiment with different databases,
> so that we can find the right balance between the two needs I describe
> above.
>
> Robin
>
> On Fri, Jul 23, 2010 at 1:23 AM, Gloria W <strangest at comcast.net> wrote:
>
> On 07/23/2010 01:20 AM, Jennifer Redman wrote:
>
> On Thu, Jul 22, 2010 at 9:40 PM, Yian Shang <yian.shang at gmail.com> wrote:
>
>
> I just set up and played with MongoDB for a bit and I added the
> indexes we'd probably need on the wiki:
> http://systers.org/systers-dev/doku.php/database-schema.
>
>
>
> MongoDB is INCREDIBLY unstable and in no way ready for production.
>
> I VERY STRONGLY DISAGREE. Some people simply aren't using it
> correctly. This article states the contrary, in fine detail:
>
> http://blog.boxedice.com/2010/02/28/notes-from-a-production-mongodb-deployment/
>
>
> There is this slight problem of complete and totally unexpected
> data loss.
>
> Again, I don't believe this. In this one particular case, the
> database was most likely corrupted and needed to be repaired, which
> can happen to any database, irrespective of type.
>
> Also, 10gen makes it very obvious in their presentations that if
> you want immediate, verified writes (ACID compliance), you set a
> flag, and it is accomplished. They state clearly why this is
> necessary, the trade-offs for each choice, and that data integrity
> is not an issue. The person who wrote the fear-mongering article
> also fails to state that even CouchDB works this way, but has the
> opposite default value. Read this for an accurate assessment of
> the durability of MongoDB:
>
> http://blog.mongodb.org/post/381927266/what-about-durability
>
> As an aside, these trade-offs have been made by MySQL users for
> many years now, when choosing between MyISAM and InnoDB, so
> this is not a new philosophy by any means.
>
>
> http://www.mikealrogers.com/2010/07/mongodb-performance-durability/
>
> If you want to play with a NoSQL db look at CouchDB -- but I highly
> advise against using MongoDB for anything.
>
> I strongly disagree with this statement, and know many who use it
> in production for high throughput, have not lost data, and are
> happy with the results. I am also going to ask you for your data
> substantiating this statement, since I cannot find any. If you're
> going to point to this one article, this will not suffice.
>
> If you want to switch to a non-relational db server then you need to
> do some analysis first on why that is a correct decision and then
> evaluate all the tools. Stability and extensibility are extremely
> important. How many people running Mailman are going to be willing
> to run MongoDB? Or CouchDB for that matter?
>
> I think people will be more willing than you realize, given the
> throughput benefits. Like I said before, db architects have been
> making the ACID vs. non-ACID compliance decision for many years,
> and this is not a new concept.
>
> Also, I am more fearful of projects which use relational, heavily
> indexed, slow-insert databases on potentially high traffic
> projects, without doing the analysis first on where this model
> will fail. Your view of having to do analysis to prove the need
> for a non-relational db strikes me as odd. We need to do analysis
> to prove that a relational db is the correct solution to this
> problem, and will not so badly affect performance that it brings
> Mailman to its knees.
>
> How many people are going to be willing to adjust their SHMMAX
> setting to make it run faster, and potentially bring their
> machine to a grinding halt? How many people will be willing to
> auto-vacuum their db hourly? How many others will be willing to
> change the MySQL storage engine mid-stream, from InnoDB to the
> less reliable, non-ACID-compliant yet faster-insert model
> (MyISAM)? How many others will feel as if they have to switch to
> a much more expensive architecture, just to allow the relational
> db to keep up with massive amounts of single-record inserts? Most
> certainly, there's trouble ahead using this model, and analysis
> needs to be done for this model.
>
> Maybe the solution is to write a flexible DB API, and retrofit one
> relational and one non-relational db solution to it. No single
> group of developers should be making a predetermined decision
> about what database solutions would be appropriate for anyone
> else's mail traffic, IMHO. There is no one-size-fits-all solution
> to this issue. So I propose that we make this solution flexible
> via a DB API, and provide two preconfigured, out-of-the-box
> solutions while allowing the user to come up with their own.
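
A minimal sketch of what such a DB API could look like, assuming a
Python interface with one relational and one non-relational backend
behind it. Every name here (MessageStore, SqliteStore, DictStore,
archive_and_read) is hypothetical and not part of Mailman. SQLite
stands in for the relational option, and a plain in-memory dict stands
in for a document store such as MongoDB or CouchDB, so the example
needs nothing beyond the standard library.

```python
import sqlite3
from abc import ABC, abstractmethod


class MessageStore(ABC):
    """Hypothetical storage interface the rest of the code would depend on."""

    @abstractmethod
    def save(self, message_id, list_name, body):
        """Store or overwrite one message."""

    @abstractmethod
    def fetch(self, message_id):
        """Return the message as a dict, or None if it is unknown."""


class SqliteStore(MessageStore):
    """Relational backend (SQLite used here as a stand-in)."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "message_id TEXT PRIMARY KEY, list_name TEXT, body TEXT)"
        )

    def save(self, message_id, list_name, body):
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO messages VALUES (?, ?, ?)",
                (message_id, list_name, body),
            )

    def fetch(self, message_id):
        row = self._conn.execute(
            "SELECT list_name, body FROM messages WHERE message_id = ?",
            (message_id,),
        ).fetchone()
        if row is None:
            return None
        return {"list_name": row[0], "body": row[1]}


class DictStore(MessageStore):
    """Stand-in for a document store: one dict per message, keyed by id."""

    def __init__(self):
        self._docs = {}

    def save(self, message_id, list_name, body):
        self._docs[message_id] = {"list_name": list_name, "body": body}

    def fetch(self, message_id):
        doc = self._docs.get(message_id)
        return dict(doc) if doc is not None else None


def archive_and_read(store, message_id, list_name, body):
    """Caller code sees only the MessageStore interface, never the backend."""
    store.save(message_id, list_name, body)
    return store.fetch(message_id)


results = [
    archive_and_read(store, "<1@example.org>", "systers-dev", "hello")
    for store in (SqliteStore(), DictStore())
]
```

A site administrator would then choose a backend in configuration, and
a third party could supply their own by implementing the same two
methods.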
>
> Gloria