bugs

Forum > Bugs > Double Post Bug
Reply To Thread (login)
Alan [48]
2011-07-02 15:31:42
[13 years, 146 days ago]

Got the Screen Shot. http://i.imgur.com/38DdB.png

The Source http://pastebin.com/PTv4NaVq

I think the problem is that both our posts where made at the exact same time. Rendering Problem I'd say.


 
Saiyan Z [104]
2011-07-02 15:33:26
[13 years, 146 days ago]

Hehe...I also took a screenshot of it. Was thinking of posting as bug too


 
Ender [1]
Administrator
2011-07-02 15:35:47
[13 years, 146 days ago]

Thanks for saving the source too, that should help with tracking this down. I'll take a look.


 
Intsecuris [33]
2011-07-02 15:38:53
[13 years, 146 days ago]

Always happy to help... in my own way.


 
Ender [1]
Administrator
2011-07-02 15:49:54
[13 years, 146 days ago]

All better now. :)


 
Alan [49]
2011-07-02 17:32:55
[13 years, 146 days ago]

What was happening?


 
Ender [1]
Administrator
2011-07-02 18:34:47
[13 years, 146 days ago]

The way the SQL query works for fetching the thread list, it finds the last post for each thread (by timestamp) in the posts table, joins that to the threads table, and then joins the result of that subquery back onto the posts table again (twice actually...and twice onto the bots table). That final step was being repeated though when there was more than one match. And there would of course be more than one match if there was more than one post with the "last" timestamp.

Oh hell, here's the query:

SELECT
    post_data.thread_id AS id,
    name,
    is_sticky,
    is_locked,
    replies,
    views,

    first_when,
    posts1.author_id AS first_author,
    bots1.username AS first_name,
    bots1.color AS first_color,
    posts1.level AS first_level,

    last_when,
    posts2.author_id AS last_author,
    bots2.username AS last_name,
    bots2.color AS last_color,
    posts2.level AS last_level
FROM (
    SELECT
        thread_id,
        name,
        is_sticky,
        is_locked,
        COUNT(*) - 1 AS replies,
        views,
        MIN(posted) AS first_when,
        MAX(posted) AS last_when
    FROM posts
    INNER JOIN threads ON posts.thread_id = threads.id
    WHERE forum_id = ?
    GROUP BY thread_id
    ORDER BY last_when DESC
    LIMIT ?, ?) post_data
INNER JOIN posts AS posts1 ON post_data.thread_id = posts1.thread_id AND post_data.first_when = posts1.posted
INNER JOIN bots AS bots1 ON posts1.author_id = bots1.id
INNER JOIN posts AS posts2 ON post_data.thread_id = posts2.thread_id AND post_data.last_when = posts2.posted
INNER JOIN bots AS bots2 ON posts2.author_id = bots2.id
ORDER BY last_when DESC;

I didn't see an easy way of altering the query to fix this, so I actually solved it on the PHP side. The code loops over the database results and saves them into an array, so I just made the thread id the key in the array. So when a duplicate thread comes along now, it gets clobbered in the array when there's a duplicate as a result of the key collision.

I present to you, the complete diff of the update that fixed this bug:

[mazur@bluethang trunk]$ svn diff -r 409:410
Index: www/forum-threads.php
===================================================================
--- www/forum-threads.php   (revision 409)
+++ www/forum-threads.php   (revision 410)
@@ -114,7 +114,7 @@
         ->setLastBot($last)
         ->setLastWhen($row['last_when']);

-    $threads[] = $thread;
+    $threads[$row['id']] = $thread;
 }
 $data['threads'] = $threads;

And in retrospect, I probably could have actually been able to solve it in the query by finding the max post id for each thread (instead of timestamp), but I'm not going to go back and ressolve it.

</more than you probably wanted>


 
Imperium [16]
2011-07-02 18:39:47
[13 years, 146 days ago]

Wow...Why not just update the thread with the new post timestamp. Then order by that...

Like

("UPDATE threads SET last_post = ".time()." WHERE id = ".$id."")

Then just order it...


 
Ender [1]
Administrator
2011-07-02 18:52:33
[13 years, 146 days ago]

True, that would be one way of solving this. I try to avoid solutions like that though when possible because it can lead to data anomalies if you're not careful. By duplicating a piece of data (last post timestamp for each thread in this case), you open yourself up to data inconsistencies which can in turn lead to unexpected behavior.

For example, if the last post in a thread is deleted, that timestamp has to be updated. That's a simple enough situation to handle, but that's just the most obvious. I didn't mention having to know the id, username, color, and level of the bot that made the last post. We could add additional fields on the threads table for tracking these as well, but now we have to remember to update them in many more situations (user changes bot color, user levels, user changes name (not unprecedented)). Not duplicating data completely avoids having to deal with those kinds of problems.

Note that I actually did have to do the kind of duplication you suggested for the achievements leaderboard. There's just no way ordering all bots by number of achievement points could be an efficient query without doing this. I handle the update anomaly problem with the use of database triggers, another dangerous beast, but that's a story for another day...


 
DarkNinjaMaster [57]
2011-07-02 21:37:05
[13 years, 146 days ago]

/me nods.

Yup, for sure, uh-huh, okay, right, right, yeah, well there you go then. Ah I see, very nicely done I must say.


 
Forum > Bugs > Double Post Bug
Reply To Thread (login)