Filter out unknown characters in a PHP guestbook

- 1 answer

I have a website with a guestbook on it. It is built with PHP (CodeIgniter).

To filter out 'bad' words, I use my own 'blacklist' of words. If a guestbook comment contains a 'bad word', points are added to a counter. If the counter ends up higher than 2, the comment is reported as spam and is not allowed.

This worked fine until a few weeks ago.

I keep getting comments that consist of nothing but ?'s. I have added ???? to my blacklist, so whenever a comment contains four or more ?'s in a row, 2 points are added and the comment is treated as spam.

    if (strpos($comment, '????') !== false) {
        $points = $points + 2;
    }

And it works: when I try to add a comment like "??? ?? ????????????" myself, it is blocked.
But I still keep getting spam with only ?'s in it, so I think the problem lies elsewhere. My guess is that the input is actually in a script such as Arabic or Chinese, and that it isn't recognised and gets converted into ?'s.

So the spam still ends up in my guestbook.

How could I solve this?



mb_detect_encoding will give you a best guess at the encoding of the input. The text shows up as '?'s because your database isn't set to a character set and collation that can store those characters. Hope this helps...
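
If the ?'s only appear once the comment has been stored, the strpos check never sees them, because the raw input still contains the original characters. A minimal sketch of a check that runs on the raw input instead is below. It assumes the form is submitted as UTF-8 and that the guestbook expects Latin-script comments. The function name nonLatinPoints and the "more than half" threshold are made up for illustration and are not part of the original code.

```php
<?php
// Hypothetical helper: returns 2 spam points when a comment is mostly
// written in a non-Latin script, or is not valid UTF-8 at all.
function nonLatinPoints(string $comment): int
{
    // Assumption: the form is served and submitted as UTF-8.
    // Anything that is not valid UTF-8 is treated as suspicious.
    if (!mb_check_encoding($comment, 'UTF-8')) {
        return 2;
    }

    // Count all letters, then only the letters of the Latin script.
    // The /u modifier makes PCRE treat the subject as UTF-8.
    $letters = (int) preg_match_all('/\p{L}/u', $comment);
    $latin   = (int) preg_match_all('/\p{Latin}/u', $comment);

    // No letters at all (e.g. only punctuation): leave it to the other filters.
    if ($letters === 0) {
        return 0;
    }

    // More than half of the letters are non-Latin: add 2 points.
    return (($letters - $latin) * 2 > $letters) ? 2 : 0;
}
```

It could be called next to the existing blacklist checks, for example `$points = $points + nonLatinPoints($comment);`, before the comment is saved. Note that this only scores the comment; it does not fix how the database stores non-Latin text.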