AI User Profile Moderation

Extend your community safety beyond content with AI-powered user profile moderation. social.plus automatically scans user profiles — including display names, avatars, and descriptions — for policy violations, giving admins the tools to review, reset, and manage flagged profiles.

Post-Moderation

AI reviews profile content after save — flag-only, no auto-delete

Admin Reset

Reset flagged profile fields (display name, avatar, description) to safe defaults

Moderation Feed

Unified Users tab to review AI-flagged and user-reported profiles

Profile Blocklist

Dedicated blocklist category for display names and descriptions

Overview

AI User Profile Moderation is a post-moderation system — profiles are scanned after the user saves changes. Unlike content moderation (posts, comments, messages), user profile moderation is flag-only: the block confidence threshold that auto-deletes content does not apply to user profiles. Admins must explicitly decide to reset, clear the flag, or ban the user.

Flag-Only for Profiles: Unlike posts and messages, the block confidence threshold does not auto-delete user profile content. All flagged profiles require manual admin review. This ensures that user accounts are never silently wiped by automated systems.

What Gets Scanned

AI moderation scans three profile fields after each update:

Field	Scan Type	Pre-Moderation	Post-Moderation
Display Name	Text	❌	✅ Flagged if violation detected
Description	Text	❌	✅ Flagged if violation detected
Avatar (uploaded file)	Image	✅ Blocked before upload	✅ Flagged after save

Avatar Pre-Moderation: Uploaded avatar images are already scanned at upload time. Post-moderation provides a second layer of review after the profile is saved.

avatarCustomUrl Gap: Avatars set via external URL (avatarCustomUrl) bypass the upload pipeline entirely — no pre-moderation image scan runs on these images. Only post-moderation scanning applies.
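The coverage described above can be summarized as a small lookup. This is only an illustrative sketch: the field keys (`avatar_upload`, `avatar_custom_url`, and so on) and stage names are made up for the example and are not social.plus API identifiers.

```python
# Scan stages per profile field, transcribed from the table above.
# Keys and stage names are illustrative, not API identifiers.
SCAN_STAGES = {
    "display_name": {"pre": False, "post": True},
    "description": {"pre": False, "post": True},
    "avatar_upload": {"pre": True, "post": True},
    # External avatar URLs skip the upload pipeline, so only post-moderation runs.
    "avatar_custom_url": {"pre": False, "post": True},
}


def stages_for(field: str) -> list[str]:
    """Return the moderation stages that run for a given profile field."""
    stages = SCAN_STAGES[field]
    ordered = (("pre-moderation", stages["pre"]), ("post-moderation", stages["post"]))
    return [name for name, enabled in ordered if enabled]


print(stages_for("avatar_upload"))      # ['pre-moderation', 'post-moderation']
print(stages_for("avatar_custom_url"))  # ['post-moderation']
```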

Detection Categories

User profile text is scanned against the same categories as posts, comments, and messages:

Text Detection Categories

Harassment or Bullying: Targeted abuse or intimidation in display names or descriptions
Sexual Content or Nudity: Explicit sexual references in profile text
Violence or Threatening Content: Violent threats or graphic descriptions
Hate: Hate speech targeting protected groups
Fraudulent Intent and Scam Promotion: Scam tactics, phishing links in descriptions
Self Harm or Suicide: Content related to self-harm or suicidal ideation

PII Detection

URL — Links and web addresses embedded in profile descriptions
PersonType — References to specific person types or identities

Image Detection (Avatar)

Avatar images are scanned for the same visual categories as other media content:

Adult content, nudity, and suggestive imagery
Violence and harmful content
Hate symbols and extremist content
Substance-related content

See AI Content Moderation — Multimedia Content Detection for the full list of image categories.

Confidence Thresholds

User profile moderation reuses the same text and image confidence thresholds configured for content moderation. However, only the flag confidence threshold applies — there is no auto-block for user profiles.

Threshold	Applies to Profiles?	Behavior
Flag Confidence (default: 40)	✅ Yes	Profile flagged for admin review
Block Confidence (default: 80)	❌ No	Does not auto-delete profile fields

Confidence thresholds are shared across content and user profiles, so changing the flag threshold affects both. Separate threshold configuration for profiles is not yet available.
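A minimal sketch of the flag-only rule, using the default thresholds from the table above. The function name, the `target` labels, and the assumption that a score at or above a threshold triggers it are all illustrative; the actual comparison used by the platform is not documented here.

```python
FLAG_CONFIDENCE = 40   # default flag threshold from the table above
BLOCK_CONFIDENCE = 80  # default block threshold from the table above


def moderation_action(confidence: float, target: str) -> str:
    """Return 'none', 'flag', or 'block' for a single detection.

    target is 'profile' or 'content' (hypothetical labels).
    Assumes a score at or above a threshold triggers that threshold.
    """
    # The block threshold only applies to content, never to profiles.
    if target != "profile" and confidence >= BLOCK_CONFIDENCE:
        return "block"
    if confidence >= FLAG_CONFIDENCE:
        return "flag"
    return "none"


print(moderation_action(95, "content"))  # block
print(moderation_action(95, "profile"))  # flag
```

Even a very high confidence score on a profile field results in a flag, leaving the decision to an admin.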

Moderation Feed — Users Tab

Flagged user profiles appear in a dedicated Users tab within the Moderation Feed (Moderation > Moderation feed), alongside the existing Posts/Comments and Messages tabs.

The Users tab has two views: To Review and Reviewed.

The To Review list displays all user profiles that require moderator attention. Each flagged profile shows:

AI moderation labels with category occurrence counts (e.g., “Hate (2)”, “Violence (1)”)
Which field triggered the flag (display name or description), shown by a label such as (Profile description is flagged)
User report counts from other community members (e.g., “4 users”)
Last flagged timestamp

Available actions:

Reset profile — Reset flagged profile fields to safe defaults
Ban globally — Ban the user across all communities
Clear flag — Dismiss the flag and approve the profile
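The label format shown in the feed (for example "Hate (2)") can be reproduced from a list of per-detection categories. This is a sketch of the display format only; the input shape is an assumption and not the format returned by any social.plus API.

```python
from collections import Counter


def format_labels(detections: list[str]) -> list[str]:
    """Collapse per-detection categories into 'Category (count)' labels."""
    counts = Counter(detections)
    # most_common() orders by count, highest first.
    return [f"{category} ({n})" for category, n in counts.most_common()]


print(format_labels(["Hate", "Violence", "Hate"]))  # ['Hate (2)', 'Violence (1)']
```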

Admin Reset

When an admin resets a flagged profile, the affected fields are restored to safe defaults:

Field	Reset Behavior
Display Name	Set to `{configuredPrefix}{UUID(12)}` (e.g., `User_a1b2c3d4e5f6`)
Description	Cleared (set to empty)
Avatar	Removed — reverts to default avatar. Original file is deleted from storage.

To reset a flagged profile:

1. Navigate to Moderation Feed: Go to Moderation > Moderation feed and select the Users tab.
2. Review Flagged Profile: Click a flagged user to view their profile details, AI moderation labels, and flag history.
3. Choose Action: Select Reset profile from the moderation actions. A confirmation modal shows which fields will be reset.
4. Confirm Reset: Confirm the reset. The user is notified that their profile has been reset.

Avatar Reset is Irreversible: When an avatar is reset, the original file is deleted from storage. Only the fact that a reset occurred is logged — the original image cannot be recovered.

The display name reset prefix is configurable via the network setting moderation.displayNameResetPrefix (API-only).
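The reset behavior in the table above can be sketched as follows. The `Profile` class and function names are hypothetical, the default prefix `User_` is taken from the example value, and `UUID(12)` is assumed to mean the first 12 hexadecimal characters of a random UUID. The sketch does not model deletion of the avatar file from storage.

```python
import uuid
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Profile:
    display_name: str
    description: str
    avatar_file_id: Optional[str]  # None means the default avatar


def reset_display_name(prefix: str = "User_") -> str:
    """Build a safe default name: prefix plus 12 hex characters (assumed format)."""
    return f"{prefix}{uuid.uuid4().hex[:12]}"


def reset_profile(profile: Profile, fields: set[str], prefix: str = "User_") -> Profile:
    """Return a copy of the profile with only the selected fields reset."""
    changes = {}
    if "display_name" in fields:
        changes["display_name"] = reset_display_name(prefix)
    if "description" in fields:
        changes["description"] = ""        # cleared
    if "avatar" in fields:
        changes["avatar_file_id"] = None   # reverts to the default avatar
    return replace(profile, **changes)


flagged = Profile("offensive name", "my bio", "file-123")
print(reset_profile(flagged, {"display_name", "avatar"}))
```

Fields that are not selected for reset, such as the description in this usage example, are left unchanged.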

User Notifications

When a profile is reset, the affected user receives a notification through the notification tray:

Notification type: USER_PROFILE_RESET
Content: Informs the user which fields were reset and that they can update their profile again
Delivered via the existing notification tray system
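As a rough illustration of what such a notification might carry, the sketch below builds a payload around the documented `USER_PROFILE_RESET` type. Every other key (`userId`, `resetFields`, `message`) and the message wording are hypothetical and are not taken from the social.plus notification schema.

```python
def build_reset_notification(user_id: str, reset_fields: list[str]) -> dict:
    """Build an illustrative notification payload for a profile reset."""
    field_list = ", ".join(reset_fields)
    return {
        "type": "USER_PROFILE_RESET",   # documented notification type
        "userId": user_id,              # hypothetical key
        "resetFields": reset_fields,    # hypothetical key
        # Hypothetical wording: names the reset fields and invites an update.
        "message": f"These profile fields were reset: {field_list}. "
                   "You can update your profile again.",
    }


print(build_reset_notification("user-42", ["display name", "avatar"]))
```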

Profile Blocklist

A dedicated blocklist category for user profiles allows admins to block specific words and phrases from appearing in display names and descriptions.

Blocklist Category	Applies To	Description
Content (existing)	Posts, comments, messages	Standard content blocklist
User Profile (new)	Display names, descriptions	Blocks words/phrases in profile text

The profile blocklist is independent from the content blocklist. You can maintain different blocked terms for user profiles versus post/comment content.
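The independence of the two categories can be pictured as two separate term sets, each consulted only for its own kind of text. The category keys, the sample terms, and the matching rule (case-insensitive, whole single words) are assumptions for illustration; the platform's actual matching behavior, including phrase matching, may differ.

```python
import re

# Hypothetical blocklists; each category keeps its own terms.
BLOCKLISTS = {
    "content": {"spamword"},
    "user_profile": {"admin", "moderator"},
}


def blocked_terms(text: str, category: str) -> set[str]:
    """Return blocklisted words found in text for one blocklist category."""
    words = set(re.findall(r"\w+", text.lower()))
    return words & BLOCKLISTS[category]


print(blocked_terms("Official Admin", "user_profile"))  # {'admin'}
print(blocked_terms("Official Admin", "content"))       # set()
```

The same text is blocked as a display name but allowed in a post, because only the user profile list contains the term.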

User History

Profile moderation events are recorded in the user’s activity history, accessible from the User History page:

AI flag events: When AI detects a violation in a profile field
Profile reset events: Logged as RESET_USER_PROFILE activity with details on which fields were reset, the actor, and reason
Moderation actions: Flag cleared, globally banned, etc.
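A reset event with the details listed above (fields reset, actor, reason) could be modeled like this. Only the `RESET_USER_PROFILE` activity name comes from this page; the class and attribute names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ProfileResetActivity:
    user_id: str              # whose profile was reset
    actor_id: str             # the admin who performed the reset
    reason: str
    reset_fields: list[str]
    activity_type: str = "RESET_USER_PROFILE"  # documented activity name
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


event = ProfileResetActivity(
    user_id="user-42",
    actor_id="admin-7",
    reason="Hate speech in display name",
    reset_fields=["display_name"],
)
print(event.activity_type, event.reset_fields)
```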

Best Practices

Review Strategy

Prioritize by severity: Review profiles with multiple AI flag categories first
Check context: Some display names or descriptions may be flagged incorrectly — review before resetting
Use reset judiciously: Resetting a profile is disruptive to the user; prefer clearing the flag for borderline cases
Monitor repeat offenders: Users who are repeatedly flagged may warrant a global ban
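One way to apply the first recommendation is to sort an exported review queue by the number of distinct AI flag categories. The record shape is hypothetical, and using the user report count as a tie-breaker is an added assumption rather than documented platform behavior.

```python
def review_order(flagged: list[dict]) -> list[dict]:
    """Sort profiles so those with the most distinct flag categories come first."""
    return sorted(
        flagged,
        # Tie-break on report count (an assumption, not a documented rule).
        key=lambda p: (len(p["categories"]), p["report_count"]),
        reverse=True,
    )


queue = [
    {"user_id": "a", "categories": {"Hate"}, "report_count": 4},
    {"user_id": "b", "categories": {"Hate", "Violence"}, "report_count": 0},
    {"user_id": "c", "categories": {"Hate"}, "report_count": 1},
]
print([p["user_id"] for p in review_order(queue)])  # ['b', 'a', 'c']
```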

Blocklist Management

Separate concerns: Maintain different blocklists for content vs. user profiles
Common patterns: Add known offensive display name patterns to the profile blocklist
Regular updates: Review and update the blocklist as new patterns emerge
Test impact: Check existing users before adding broad blocklist terms

Confidence Tuning

Shared thresholds: Remember that confidence thresholds are shared between content and profiles
Monitor false positives: Track how often legitimate profile content gets flagged
Flag-only safety net: Since profiles are flag-only, a lower flag threshold is safer than with content (no risk of auto-deletion)
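To track false positives, one simple proxy is the share of reviewed AI flags that admins ended up clearing. The outcome labels below are hypothetical, and treating every cleared flag as a false positive is a simplifying assumption.

```python
def false_positive_rate(outcomes: list[str]) -> float:
    """Fraction of reviewed flags that were cleared (proxy for false positives)."""
    if not outcomes:
        return 0.0
    cleared = sum(1 for outcome in outcomes if outcome == "clear_flag")
    return cleared / len(outcomes)


reviewed = ["clear_flag", "reset_profile", "clear_flag", "ban_globally"]
print(false_positive_rate(reviewed))  # 0.5
```

A rate that climbs after lowering the flag threshold suggests the threshold is generating more review work than it is worth.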

Limitations

Flag-only: AI cannot auto-delete user profile content — all flagged profiles require admin review
avatarCustomUrl bypasses pre-moderation: External avatar URLs are not scanned at upload time
Shared confidence thresholds: Profile and content moderation share the same flag thresholds; separate configuration is not yet available
Original avatar not recoverable: Once an avatar is reset, the original file is permanently deleted from storage

Related Topics

AI Content Moderation

AI moderation for posts, comments, messages, images, and video content

Moderation Overview

General moderation tools, roles, and workflows

User Insights

View detailed user information, history, and activity
