Pre-Moderation
Block inappropriate content before it’s published with proactive AI scanning
Post-Moderation
Monitor and review published content with intelligent flagging and automated actions
Overview
social.plus offers two complementary AI moderation approaches:
Pre-Moderation
Proactive Content Filtering
- Content is scanned before publication
- AI generates confidence scores for detected violations
- Content blocked if confidence exceeds configured threshold
- User must modify content to proceed with posting
Post-Moderation
Reactive Content Review
- Content is scanned after publication
- Uses `flagConfidence` and `blockConfidence` thresholds
- Automatically flags content for review or removes violations
- Maintains community safety without blocking legitimate content
Getting Started
Enable AI Moderation
Contact our support team to enable AI content moderation for your application.
Configure Settings
Set up confidence levels and moderation categories through the social.plus Console.
AI Pre-Moderation
Prevent inappropriate content from reaching your community with proactive AI scanning. Pre-moderation ensures all content meets your standards before publication.
Current Availability: Pre-moderation is currently available for image content, with text and video support coming soon.
Image Content Detection
Our AI pre-moderation scans all uploaded images for inappropriate content across four key categories:
Content Categories
- Nudity: Detection of explicit or inappropriate nudity
- Suggestive Content: Sexually suggestive or provocative imagery
- Violence: Violent or graphic content detection
- Disturbing Content: Content that may be psychologically disturbing
Configuration
Enable Image Moderation
Navigate to Moderation > Image Moderation in your social.plus Console and toggle “Enable image moderation” to ON.
Set Confidence Levels
Configure confidence thresholds for each category based on your community standards.
Understanding Confidence Levels
Confidence levels represent the AI’s certainty in detecting specific content types (a configuration sketch follows this list):
- Low Confidence (0-30): High sensitivity, may block legitimate content
- Medium Confidence (40-70): Balanced approach for most communities
- High Confidence (80-100): Conservative filtering, may miss some violations
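To make these thresholds concrete, here is a minimal sketch of how a pre-moderation block decision could work. The category names, the config shape, and the isBlocked helper are illustrative assumptions, not the actual social.plus schema; the real settings are managed in the Console.

```typescript
// Hypothetical per-category block thresholds, mirroring the Console settings.
type ImageCategory = "nudity" | "suggestive" | "violence" | "disturbing";

const blockThresholds: Record<ImageCategory, number> = {
  nudity: 40,      // medium confidence: balanced for most communities
  suggestive: 70,  // high confidence: conservative, fewer false positives
  violence: 40,
  disturbing: 40,
};

// Pre-moderation: block the upload if any category's confidence score
// meets or exceeds its configured threshold.
function isBlocked(scores: Record<ImageCategory, number>): boolean {
  return (Object.keys(blockThresholds) as ImageCategory[]).some(
    (category) => scores[category] >= blockThresholds[category],
  );
}

// An image scoring 85 on "nudity" exceeds the threshold of 40, so the
// upload is rejected and the user must modify the content to proceed.
isBlocked({ nudity: 85, suggestive: 10, violence: 0, disturbing: 5 }); // true
```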
AI Post-Moderation
Monitor and moderate published content with intelligent detection and automated response workflows. Post-moderation provides comprehensive scanning across all content types while maintaining user experience.
Text Moderation
Detect inappropriate language, hate speech, and harmful text content
Image & Video
Scan visual content for policy violations and harmful imagery
Automated Actions
Configure intelligent responses based on confidence levels
Content Coverage
Supported Content Types
Posts
- Text, images, videos, and livestream content
- Full multimedia content analysis
- Community-specific rule application
Comments
- Text and image content scanning
- Context-aware threat detection
- Reply chain analysis
Messages
- Text, image, and video content in direct messages
- Private conversation safety monitoring
- Bulk message pattern detection
Text Content Detection
Our AI text moderation identifies and handles various types of inappropriate text content:
Detection Categories
- Sexually Explicit Content: Adult content and explicit sexual references
- Suggestive Content: Sexually suggestive or mature language
- Offensive Language: Hate speech, harassment, and abusive language
- Harmful Content: Self-harm, violence, and dangerous activities
Multimedia Content Detection
Advanced visual content analysis covers extensive categories:
Adult Content
- Adult Toys, Explicit Nudity, Graphic Nudity
- Sexual Activity, Sexual Situations, Suggestive Content
- Swimwear, Underwear, Revealing Clothes
- Partial Nudity, Illustrated Explicit Nudity
Violence & Harmful Content
- Graphic Violence, Gore, Physical Violence
- Weapons, Weapon Violence, Explosions
- Self Injury, Hanging, Corpses
- Emaciated Bodies, Visually Disturbing Content
Substance-Related Content
Extremist & Hate Content
- Extremist, Nazi Party, White Supremacy
- Hate Symbols, Rude Gestures, Middle Finger
Other Restricted Content
- Gambling, Air Crash, Disasters
- Bare-chested Male (context-dependent)
- Other contextually inappropriate content
Understanding Confidence Scores
Confidence Thresholds
Flag Confidence (Default: 40)
- Content scoring above this level gets flagged for review
- Lower values = more content flagged (higher sensitivity)
- Recommended range: 30-60 depending on community standards
Block Confidence (Default: 80)
- Content scoring above this level gets automatically removed
- Higher values = fewer false positives
- Recommended range: 70-90 for balanced protection
Score Ranges
- 0-39: Content passes moderation (approved)
- 40-79: Content flagged for human review
- 80-100: Content automatically blocked/removed
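As an illustration of how the two thresholds combine into these three outcomes, the sketch below applies them to a single category score. The classify function is a hypothetical example, not part of the SDK.

```typescript
type ModerationOutcome = "approved" | "flagged" | "blocked";

// Post-moderation decision for one category, using the two thresholds
// described above. Defaults match the documented starting configuration.
function classify(
  score: number,
  flagConfidence = 40,
  blockConfidence = 80,
): ModerationOutcome {
  if (score >= blockConfidence) return "blocked"; // 80-100: removed automatically
  if (score >= flagConfidence) return "flagged";  // 40-79: queued for human review
  return "approved";                              // 0-39: passes moderation
}

classify(25); // "approved"
classify(55); // "flagged"
classify(92); // "blocked"
```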
Default Configuration: All categories start with `flagConfidence: 40` and `blockConfidence: 80`. Monitor your community’s content patterns and adjust these values to optimize for your specific needs.
Configuration Parameters
Parameter Reference
| Parameter | Type | Description |
|---|---|---|
| category | String | Name of the moderation category |
| flagConfidence | Number | Threshold for flagging content (0-100) |
| blockConfidence | Number | Threshold for blocking content (0-100) |
| moderationType | String | Type of content: “text” or “media” |
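Put together, a single category’s configuration might look like the object below. The ModerationCategoryConfig interface and the category name are assumptions for illustration; the actual payload accepted by the configuration API may differ.

```typescript
// Illustrative category configuration using the parameters above.
interface ModerationCategoryConfig {
  category: string;       // name of the moderation category
  flagConfidence: number; // 0-100, flag-for-review threshold
  blockConfidence: number; // 0-100, automatic-removal threshold
  moderationType: "text" | "media";
}

const offensiveLanguageConfig: ModerationCategoryConfig = {
  category: "offensive_language", // hypothetical category name
  flagConfidence: 40,
  blockConfidence: 80,
  moderationType: "text",
};
```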
API Configuration
Regional Endpoints
Select the appropriate API endpoint for your region to ensure optimal performance:
| Region | API Endpoint |
|---|---|
| Europe | https://api-eu.social.plus/ |
| Singapore | https://api-sg.social.plus/ |
| United States | https://api-us.social.plus/ |
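A request to a configuration endpoint might then look like the sketch below. The path /api/v3/moderation/settings and the bearer-token header are assumptions for illustration; consult the API reference for the actual routes and authentication scheme.

```typescript
// Hedged sketch of calling a configuration endpoint on a regional host.
const REGION_BASE_URL = "https://api-eu.social.plus/"; // pick your region

async function fetchModerationSettings(adminToken: string): Promise<unknown> {
  const response = await fetch(
    new URL("api/v3/moderation/settings", REGION_BASE_URL), // hypothetical path
    { headers: { Authorization: `Bearer ${adminToken}` } },
  );
  if (!response.ok) {
    throw new Error(`Request failed: ${response.status}`);
  }
  return response.json();
}
```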
API Reference
For detailed administration workflows, see the Moderation Overview and analytics export documentation.
Best Practices
Configuration Strategy
- Start Conservative: Begin with moderate confidence levels and adjust based on results
- Monitor Performance: Track false positive and false negative rates (see the example after this list)
- Community-Specific: Tailor settings to your community’s content standards
- Regular Review: Periodically review and update thresholds as your community evolves
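For the monitoring step above, the underlying math is simple. The hypothetical helpers below show the two rates worth tracking, with counts taken from your human review queue.

```typescript
// False positive rate: share of legitimate content the AI flagged.
function falsePositiveRate(flaggedButClean: number, totalClean: number): number {
  return totalClean === 0 ? 0 : flaggedButClean / totalClean;
}

// False negative rate: share of actual violations the AI missed.
function falseNegativeRate(missedViolations: number, totalViolations: number): number {
  return totalViolations === 0 ? 0 : missedViolations / totalViolations;
}

// e.g. 12 of 400 clean posts flagged -> 3% FPR; consider raising flagConfidence.
falsePositiveRate(12, 400); // 0.03
```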
Human Oversight
- Review Queue Management: Ensure consistent review of flagged content
- Moderator Training: Train team on community standards and edge cases
- Appeal Process: Provide clear paths for users to contest moderation decisions
- Transparency: Communicate moderation policies clearly to users
Performance Optimization
- Batch Processing: Handle high-volume content efficiently
- Regional APIs: Use geographically appropriate endpoints
- Webhook Integration: Implement real-time event handling for flagged content (see the sketch after this list)
- Monitoring: Set up alerts for unusual moderation patterns
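As a sketch of the webhook integration mentioned above, the handler below receives flagged-content events over HTTP. The route, the event type "content.flagged", and the payload fields are assumptions; refer to the social.plus webhook documentation for the real schema.

```typescript
import { createServer } from "node:http";

// Minimal webhook receiver for flagged-content events.
createServer((req, res) => {
  if (req.method !== "POST" || req.url !== "/webhooks/moderation") {
    res.writeHead(404).end();
    return;
  }
  let body = "";
  req.on("data", (chunk) => (body += chunk));
  req.on("end", () => {
    const event = JSON.parse(body); // hypothetical payload shape
    if (event.type === "content.flagged") {
      // e.g. notify moderators or push the item into a review queue
      console.log(`Flagged ${event.contentId} (score: ${event.confidence})`);
    }
    res.writeHead(200).end();
  });
}).listen(3000);
```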