How Can I Check If A String Is Likely To Be Generated By A Bot?
I have a spam issue. Some bot (I believe) is getting around Google reCAPTCHA and inserting strings like the following into forms on my site:
dtbNPRpfcz
VvAJEXqueSKscY
Does anyone know of any JS or C# code I can use that would give a high probability of indicating that the above string is randomly generated?
If I could check the fields being filled and know that several of them were likely to be bot generated then I could block the submission.
The above strings seem to have more uppercase characters than normal text, for example.
Update: Currently looking at running a password strength checker against some of the strings. If a string scores above "weak" then it's likely to be spam. My web host said "try another recaptcha".
Update:
Well. I've learned a lot over this and gained some useful code, so thank you very much for your input and answers. However, after ignoring the problem for the weekend I looked at it again. I noticed that the spam bot was getting around ALL the form validation. Then the penny dropped. The bot was going directly to the route and posting to it. I had not properly set up CSRF (Cross-Site Request Forgery) protection. This meant an agent could post to the URL from outside the site's domain. Doh!
I had added this to the forms:
@Html.AntiForgeryToken()
But some of my routes were missing the code to check it:
try
{
    // Throws if the anti-forgery token is missing or invalid.
    this.ValidateCsrfToken();
}
catch (CsrfValidationException)
{
    return Response.AsText("Csrf Token not valid.").WithStatusCode(403);
}
So. Apologies for wasting your time. That fixed it immediately.
Answer
Random string detection is complicated and overlaps with machine learning. I don't recommend implementing it on your own; a spell-checking JS/C# library might help.
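That said, if a rough signal is good enough, the observation in the question (too many uppercase characters in odd places) can be turned into a small heuristic. The sketch below is only an illustration in plain JavaScript: the function names, the 6-character minimum and the 0.2 thresholds are my own assumptions, not tuned values. It will misfire on some real names (for example, vowel-poor surnames), so treat a single hit as weak evidence and only act when several fields look suspicious.

```javascript
// Rough heuristic, not a real classifier. All names and thresholds are assumptions.

// Measure two simple signals for a single run of ASCII letters.
function tokenSignals(token) {
  const isAllUpper = token === token.toUpperCase(); // e.g. acronyms
  const interiorUpper = isAllUpper
    ? 0
    : (token.slice(1).match(/[A-Z]/g) || []).length; // capitals after the first letter
  const vowels = (token.match(/[aeiou]/gi) || []).length;
  return {
    interiorUpperRatio: interiorUpper / token.length,
    vowelRatio: vowels / token.length,
  };
}

// True if any sufficiently long word in the value looks machine generated.
function looksRandom(value, minLength = 6) {
  const tokens = value.match(/[A-Za-z]+/g) || [];
  return tokens.some((token) => {
    if (token.length < minLength) return false; // too short to judge
    const { interiorUpperRatio, vowelRatio } = tokenSignals(token);
    return interiorUpperRatio > 0.2 || vowelRatio < 0.2;
  });
}

// Count how many string fields of a submitted form look random.
function countSuspiciousFields(fields) {
  return Object.values(fields).filter(
    (v) => typeof v === "string" && looksRandom(v)
  ).length;
}

console.log(looksRandom("dtbNPRpfcz"));     // true
console.log(looksRandom("VvAJEXqueSKscY")); // true
console.log(looksRandom("John Smith"));     // false
```

A submission could then be rejected when, say, countSuspiciousFields(form) is 2 or more, which matches the idea in the question of blocking only when several fields look bot generated.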
Apart from that, regarding bot prevention, here are a few suggestions:
Make sure you have implemented Google reCAPTCHA correctly. Use reCAPTCHA v3 if possible, and make sure you verify g-recaptcha-response on the backend. Google reCAPTCHA is not 100% reliable and can be bypassed by some anti-captcha solving services, but a correct implementation is the baseline.
Filter out suspicious IP addresses. Block the IP addresses that the randomly generated strings are sent from.
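To make the backend verification step concrete, here is a minimal sketch in JavaScript, assuming Node.js 18+ (for the global fetch). The siteverify URL and the secret/response parameters are Google's documented verification endpoint; the function names and the 0.5 score threshold are assumptions of mine for illustration. The same request is easy to make from C# as well.

```javascript
// Sketch for Node.js 18+ (global fetch). Function names and the 0.5 threshold are illustrative.
const VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify";

// Decide whether a parsed siteverify response should be accepted.
function isAcceptable(result, minScore = 0.5) {
  if (!result || result.success !== true) return false;
  // reCAPTCHA v3 adds a score between 0.0 and 1.0; v2 responses have none.
  if (typeof result.score === "number") return result.score >= minScore;
  return true;
}

// Send the token the browser submitted (the g-recaptcha-response field)
// to Google together with the site's secret key.
async function verifyRecaptcha(token, secretKey, minScore = 0.5) {
  if (!token) return false; // a direct POST with no token fails immediately
  const body = new URLSearchParams({ secret: secretKey, response: token });
  const response = await fetch(VERIFY_URL, { method: "POST", body });
  if (!response.ok) return false;
  return isAcceptable(await response.json(), minScore);
}
```

The important part is that this check runs on the server for every route that accepts the form, so a bot that skips the page and posts straight to the URL is rejected for having no valid token.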