Forum: Spam in RSS Feeds
3
7
11.2 years ago

I don't know much about RSS, but occasionally I get spam posts from Biostar in my RSS feed, and when I click on them, they have always been deleted already. Is there any way to remove them from the RSS feed as soon as they're deleted from Biostar, so they never show up in the feed?

I'm using the feed http://www.biostars.org/feeds/latest/

meta RSS • 5.1k views
ADD COMMENT
1

Glad I'm not the only person plagued by this. I think I had more spam than actual comments from BioStar in my RSS feed today

ADD REPLY
0

+1. An additional problem is the BioStar Twitter feed. Perhaps it's fed from the same RSS? I checked Twitter last night, saw nine spam posts and deleted them, but those nine spam tweets from BioStar had already gone out to 421 followers.

ADD REPLY
4
11.2 years ago

The issue is with the RSS reader caching the results. Right now, if a post is deleted after your RSS reader has fetched it, it will stay in your reader's cache even after it is removed from the feed.

The amount of spam seems to be increasing, so it needs to be fixed at the submission level. I'll add both a "honeypot" form field and a simple extra validation field to the post interface. We could also add either a reCAPTCHA (but I don't like solving those ;-), they make me squint) or a spam classifier to proofread the posts.
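
For anyone unfamiliar with the honeypot idea, here is a minimal sketch, not the actual Biostar code. It assumes the form carries a field hidden from humans via CSS (the name `website_url` is made up for illustration); a real user leaves it empty, while a naive bot fills in every field it finds.

```python
def is_probable_bot(form_data, honeypot_field="website_url"):
    """Return True when the hidden honeypot field was filled in.

    `form_data` is a plain dict of submitted form values.
    `honeypot_field` is a hypothetical field name, hidden from
    human visitors, so any non-blank value suggests a bot.
    """
    value = form_data.get(honeypot_field, "")
    return bool(value.strip())


# A human submission leaves the hidden field empty.
human = {"title": "BLAST question", "website_url": ""}
# A bot that fills in every field trips the check.
bot = {"title": "Cheap offers", "website_url": "http://example.com"}

print(is_probable_bot(human), is_probable_bot(bot))
```

A check like this only stops bots that fill in fields blindly; a bot written specifically for the site can simply skip the hidden field.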

But we'll try a simpler solution first.

ADD COMMENT
1

There does seem to be a "wave of spam" recently and it gives a bad impression when feeds become contaminated. Ideally, it would not reach them in the first place. Unfortunately, I don't know if there is an easy semi-automated way to check when users first register. We could require moderation of first posts but that would be a lot of work for moderators and slow things down.

ADD REPLY
4

Moderator approval of the first post seems reasonable to me, as long as it can trigger an email notification or something. I'm happy to approve some posts if it means cutting down on this nonsense. If Istvan's new countermeasures don't work out, we could give it a shot.

ADD REPLY
1

Another measure I could add is that posts only show up in RSS after two to three hours. That way we have time to delete the obvious spam.

ADD REPLY
2

Strangely, all of them seem to point to Australian domains. I think it is one Aussie bot that has us figured out. Now that we have made our move, let's see what happens.

ADD REPLY
4

Yes, I apologize on behalf of real Australians everywhere :)

ADD REPLY
0

By the way, I get an error message in the admin console when trying to delete spam user accounts. I know that deletion makes no practical difference, it just makes me feel better :)

ADD REPLY
1

No, don't delete those, because then they can just come back. They need to stay banned. If you ban a user, their posts will be destroyed within six hours.

ADD REPLY
0

Good to know. I will curb my irritation :)

ADD REPLY
3
11.2 years ago

Biostar version 1.2.14 released. Added two levels of spam defense to the New Post page.

  1. New posts need to have an extra field filled in. I am curious to see how bots will handle this and how soon it will be routinely defeated.
  2. New users (defined as within 6 hours of account creation) may only create 3 posts. This is to give us a chance to ban runaway bots. An error message informs the user when the limit is triggered.
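
The new-user limit in point 2 can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the release: the function name, its arguments and the constants are hypothetical, and only the 6-hour window and the 3-post cap come from the description above.

```python
from datetime import datetime, timedelta

# Values taken from the release note; the names are hypothetical.
NEW_USER_WINDOW = timedelta(hours=6)
MAX_POSTS_FOR_NEW_USERS = 3


def may_create_post(account_created, posts_so_far, now):
    """Return False when a new account has used up its post allowance."""
    is_new_user = now - account_created < NEW_USER_WINDOW
    if is_new_user and posts_so_far >= MAX_POSTS_FOR_NEW_USERS:
        return False
    return True


now = datetime(2013, 1, 15, 12, 0)
one_hour_old = now - timedelta(hours=1)
one_day_old = now - timedelta(days=1)

print(may_create_post(one_hour_old, 2, now))   # new user, under the cap
print(may_create_post(one_hour_old, 3, now))   # new user, cap reached
print(may_create_post(one_day_old, 50, now))   # established user
```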

Also added a notification showing how many votes a user has acquired since their last visit. This will be expanded in a later commit to show the actual posts that generated the votes.

ADD COMMENT
0

I think point (1) has an answer already :(

ADD REPLY
1

They even fill in the about-me section! Now that I think about it, that is also a valuable outlink that makes Biostar contribute to their PageRank! I guess we need to hide the autobiography for banned users.

ADD REPLY
2

Also, as I look at it more, this is not the usual spam. The content is too lengthy and verbose, and the advice would even make sense in some other context. It looks like something tailored to fool search engines into treating it as legitimate content, and then to use our page rank to increase theirs. They don't actually expect our users to click on or buy their stuff. I will investigate this more closely.

ADD REPLY
1

New measures are now in place. I removed the extra field after all, since it has already been defeated.

RSS feeds are now on a four-hour delay, which gives us time to delete posts before they are pushed to the readers. Banned users will have their about pages and websites removed. Posts by banned users are pruned every six hours (removed entirely, not just marked as deleted). Brand new users may only create two new posts, then have to wait a few hours to post more.
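
A rough sketch of how a delayed feed could work, assuming posts are plain dicts with `created` and `deleted` keys (a made-up layout for illustration, not the actual Biostar models). Only posts older than the delay and not deleted in the meantime make it into the feed.

```python
from datetime import datetime, timedelta

FEED_DELAY = timedelta(hours=4)  # the delay described above


def feed_entries(posts, now, delay=FEED_DELAY):
    """Return the posts eligible for the RSS feed, newest first.

    A post is eligible once it is at least `delay` old and has not
    been deleted, so moderators get a window to remove spam first.
    """
    cutoff = now - delay
    visible = [p for p in posts if p["created"] <= cutoff and not p["deleted"]]
    return sorted(visible, key=lambda p: p["created"], reverse=True)


now = datetime(2013, 1, 15, 12, 0)
posts = [
    {"id": 1, "created": now - timedelta(hours=5), "deleted": False},
    {"id": 2, "created": now - timedelta(hours=5), "deleted": True},   # spam, caught in time
    {"id": 3, "created": now - timedelta(hours=1), "deleted": False},  # still inside the delay
]
print([p["id"] for p in feed_entries(posts, now)])
```

The trade-off is the one raised in this thread: a longer delay gives moderators more time but makes the feed less timely.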

ADD REPLY
1

I feel like the 4-hour delay is a little long. In the interest of timely answers, can we cut it to 2 hours? Do we have enough international coverage with our moderators to cover spam deletion that quickly when it crops up at odd hours?

ADD REPLY
0

It could be too long. Let's leave the 4 hours on for a few days until the next update, then I will change it to a shorter time (2 hours). By then we'll have some observations to help make a decision.

The internal update policy that I am trying to stick to is that, no matter how small the change, all tests need to be run and must pass before deploying, so I like to group changes into batches. Tests include both API tests and functional tests through the browser, where the runner fires up a browser and performs a long series of actions (via Selenium) that someone watches.

ADD REPLY
0

I'm still seeing spam as of a few hours ago. Is this just a matter of the settings not having taken effect yet?

ADD REPLY
0

Is this old or new spam? Usually the RSS entries are stored in the user's reader (to allow for quick access), so once an entry makes it in there we can't delete it. It will eventually expire. The rule just delays new entries from entering the feed.

ADD REPLY
0

I'm still seeing it: posts from 8 hours before this message, ids 62412 and 62408. Still the same Melbourne-based rubbish.

ADD REPLY
1

Just to make sure we are talking about the same thing: you can see a post in your feed that no longer exists on the site?

An RSS reader is a post collector, not a mirror of what is on the site. Once an item enters a reader it will stay there even if the originating site removes it from the feed. There is no way to go back and tell a reader that a post it has already seen in the feed should no longer be displayed.

The time at which a post gets removed from a reader depends solely on the settings of the reader and is usually a simple expiration.
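
To make the "collector, not mirror" point concrete, here is a toy model of a reader. It is a simplification with hypothetical names, not how any particular reader is implemented: items are stored by id on each poll, and nothing in a later poll can remove an item that was stored earlier.

```python
class ReaderCache:
    """Toy RSS reader that accumulates every item it has ever seen."""

    def __init__(self):
        self.items = {}  # item id -> item, kept until the reader expires it

    def poll(self, feed_items):
        """Store new items; items missing from the feed are NOT removed."""
        for item in feed_items:
            self.items.setdefault(item["id"], item)

    def visible_ids(self):
        return sorted(self.items)


reader = ReaderCache()
# First poll: the feed contains a real post (1) and a spam post (2).
reader.poll([{"id": 1, "title": "real question"}, {"id": 2, "title": "spam"}])
# The site deletes the spam; the next poll no longer contains item 2.
reader.poll([{"id": 1, "title": "real question"}, {"id": 3, "title": "new question"}])

print(reader.visible_ids())  # item 2 is still in the reader
```

This is why the only fix on the site's side is to keep spam out of the feed before any reader polls it.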

ADD REPLY
1

This is a good explanation. I think the issue is, for people who like to use RSS, seeing a lot of spam is upsetting and makes them question the utility of the site. Obviously the ideal scenario would be to prevent all or most of it from ever being posted.

ADD REPLY
0

Correct, that is what I mean, but I mainly select the content I visit on BioStar by parsing my RSS feed. The fact that it is still polluted by spam is not a function of how fast my RSS reader updates but of the failure to stop spambots signing up ;)

ADD REPLY
0

We have slowed the spam but not stopped it. I am thinking of a measure where new users would have to pass a mini captcha, possibly with a bioinformatics question rather than an image.
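
Something along these lines, as a very rough sketch. The question, the accepted answers and the function name are all made up for illustration; nothing here is implemented yet.

```python
# Hypothetical question bank: each question maps to a set of accepted answers.
QUESTIONS = {
    "Which DNA base pairs with A?": {"t", "thymine"},
}


def check_answer(question, answer):
    """Return True when the answer matches, ignoring case and outer whitespace."""
    accepted = QUESTIONS.get(question, set())
    return answer.strip().lower() in accepted


print(check_answer("Which DNA base pairs with A?", " Thymine "))
print(check_answer("Which DNA base pairs with A?", "G"))
```

Accepting several spellings and ignoring case keeps the check forgiving for genuine users.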

ADD REPLY
0

Just be careful - don't want to discourage either new bioinformatics folks or ESL users

ADD REPLY
0

Thanks, sounds good!

ADD REPLY
0

FYI, a spam post still got through to the RSS feed last night (id 62389). Just another data point, in case you hadn't noticed.

ADD REPLY
1
2.6 years ago
Malcolm.Cook ★ 1.5k

I don't know which of the above remediations are still in play now, but spam in RSS is out of control to such a degree that it puts me off skimming Biostar... unless I need to get some CBD gummies or go on a keto diet...

ADD COMMENT
1

Thanks for the note. In the new release we forgot about checking for spam in the RSS feeds. It looks like we do filter the Latest News feed, but not the other feeds. We'll add a fix soon.

ADD REPLY
