Pre-Moderation
Block inappropriate content before it’s published with proactive AI scanning
Post-Moderation
Monitor and review published content with intelligent flagging and automated actions
Overview
social.plus offers two complementary AI moderation approaches:
Pre-Moderation
Proactive Content Filtering
- Content is scanned before publication
- AI generates confidence scores for detected violations
- Content blocked if confidence exceeds configured threshold
- User must modify content to proceed with posting
Post-Moderation
Reactive Content Review
- Content is scanned after publication
- Uses flagConfidence and blockConfidence thresholds
- Automatically flags content for review or removes violations
- Maintains community safety without blocking legitimate content
Getting Started
1
Enable AI Moderation
Contact our support team to enable AI content moderation for your application.
2
Configure Settings
Set up confidence levels and moderation categories through the social.plus Console.
3
Test & Monitor
Test with sample content and monitor moderation effectiveness through analytics.
AI Pre-Moderation
Prevent inappropriate content from reaching your community with proactive AI scanning. Pre-moderation ensures all content meets your standards before publication.
Current Availability: Pre-moderation is currently available for image content, with text and video support coming soon.
Image Content Detection
Our AI pre-moderation scans all uploaded images for inappropriate content across four key categories:
Content Categories
- Nudity: Detection of explicit or inappropriate nudity
- Suggestive Content: Sexually suggestive or provocative imagery
- Violence: Violent or graphic content detection
- Disturbing Content: Content that may be psychologically disturbing
Configuration
1
Enable Image Moderation
Navigate to Moderation > Image Moderation in your social.plus Console and toggle “Enable image moderation” to ON.
2
Set Confidence Levels
Configure confidence thresholds for each category based on your community standards.
3
Test Configuration
Upload test images to verify your confidence settings work as expected.
Understanding Confidence Levels
Important: Confidence levels significantly impact moderation accuracy. Default settings may produce false positives.
- Low Confidence (0-30): High sensitivity, may block legitimate content
- Medium Confidence (40-70): Balanced approach for most communities
- High Confidence (80-100): Conservative filtering, may miss some violations
Recommendation: Start with medium confidence levels (40-60) and adjust based on your community’s needs and false positive rates.
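The blocking rule above can be sketched as a simple per-category threshold check. This is an illustrative sketch, not the actual social.plus implementation: the category keys and threshold values are assumptions chosen to match the medium-confidence recommendation.

```python
# Illustrative pre-moderation check: block an image if any category's
# AI confidence score (0-100) exceeds its configured threshold.
# Category names and thresholds are assumptions, not the real API.
THRESHOLDS = {
    "nudity": 50,
    "suggestive": 60,
    "violence": 50,
    "disturbing": 50,
}

def should_block(scores: dict) -> bool:
    """Return True if any detected category exceeds its threshold."""
    return any(
        scores.get(category, 0) > threshold
        for category, threshold in THRESHOLDS.items()
    )
```

With these settings, an image scoring 72 on nudity would be blocked before publication, while one scoring exactly at a threshold would pass; raising a threshold makes that category more permissive.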
AI Post-Moderation
Monitor and moderate published content with intelligent detection and automated response workflows. Post-moderation provides comprehensive scanning across all content types while maintaining user experience.
Text Moderation
Detect inappropriate language, hate speech, and harmful text content
Image & Video
Scan visual content for policy violations and harmful imagery
Automated Actions
Configure intelligent responses based on confidence levels
Content Coverage
Supported Content Types
Posts
- Text, images, videos, and livestream content
- Full multimedia content analysis
- Community-specific rule application
Comments
- Text and image content scanning
- Context-aware threat detection
- Reply chain analysis
Messages
- Text, image, and video content in direct messages
- Private conversation safety monitoring
- Bulk message pattern detection
Text Content Detection
Our AI text moderation identifies and handles various types of inappropriate text content:
Detection Categories
- Sexually Explicit Content: Adult content and explicit sexual references
- Suggestive Content: Sexually suggestive or mature language
- Offensive Language: Hate speech, harassment, and abusive language
- Harmful Content: Self-harm, violence, and dangerous activities
Multimedia Content Detection
Comprehensive Scanning: Our AI analyzes both static images and video content frame-by-frame for maximum protection.
Adult Content
- Adult Toys, Explicit Nudity, Graphic Nudity
- Sexual Activity, Sexual Situations, Suggestive Content
- Swimwear, Underwear, Revealing Clothes
- Partial Nudity, Illustrated Explicit Nudity
Violence & Harmful Content
- Graphic Violence, Gore, Physical Violence
- Weapons, Weapon Violence, Explosions
- Self Injury, Hanging, Corpses
- Emaciated Bodies, Visually Disturbing Content
Substance-Related Content
Extremist & Hate Content
- Extremist, Nazi Party, White Supremacy
- Hate Symbols, Rude Gestures, Middle Finger
Other Restricted Content
- Gambling, Air Crash, Disasters
- Bare-chested Male (context-dependent)
- Other contextually inappropriate content
Understanding Confidence Scores
Confidence Thresholds
Flag Confidence (Default: 40)
- Content scoring above this level gets flagged for review
- Lower values = more content flagged (higher sensitivity)
- Recommended range: 30-60 depending on community standards
Block Confidence (Default: 80)
- Content scoring above this level gets automatically removed
- Higher values = fewer false positives
- Recommended range: 70-90 for balanced protection
Score Ranges
- 0-39: Content passes moderation (approved)
- 40-79: Content flagged for human review
- 80-100: Content automatically blocked/removed
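The score ranges above reduce to a two-threshold decision. A minimal sketch, using the default flagConfidence (40) and blockConfidence (80); the function name and signature are illustrative, not part of the social.plus API:

```python
# Map a 0-100 confidence score to a post-moderation action using the
# default thresholds (flagConfidence: 40, blockConfidence: 80).
def moderation_action(score: float,
                      flag_confidence: float = 40,
                      block_confidence: float = 80) -> str:
    """Return 'approve', 'flag', or 'block' for a confidence score."""
    if score >= block_confidence:
        return "block"    # automatically removed
    if score >= flag_confidence:
        return "flag"     # queued for human review
    return "approve"      # passes moderation
```

Lowering flag_confidence widens the human-review band; raising block_confidence shifts borderline content from automatic removal into that band.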
Default Configuration: All categories start with flagConfidence: 40 and blockConfidence: 80. Monitor your community’s content patterns and adjust these values to optimize for your specific needs.
Configuration Parameters
Parameter Reference
Parameter | Type | Description |
---|---|---|
category | String | Name of the moderation category |
flagConfidence | Number | Threshold for flagging content (0-100) |
blockConfidence | Number | Threshold for blocking content (0-100) |
moderationType | String | Type of content: “text” or “media” |
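A configuration entry built from the parameters in the table above might look like the following. The exact payload shape is an assumption for illustration; consult the Console or API reference for the authoritative format.

```python
# Illustrative moderation-category entries using the documented parameters.
# The list/dict shape is an assumption; only the four parameter names,
# their types, and their ranges come from the parameter reference.
categories = [
    {"category": "offensive_language", "flagConfidence": 40,
     "blockConfidence": 80, "moderationType": "text"},
    {"category": "explicit_nudity", "flagConfidence": 30,
     "blockConfidence": 70, "moderationType": "media"},
]

def validate(entry: dict) -> bool:
    """Check an entry against the parameter reference: types and 0-100 range."""
    return (
        isinstance(entry.get("category"), str)
        and entry.get("moderationType") in ("text", "media")
        and all(
            isinstance(entry.get(key), (int, float)) and 0 <= entry[key] <= 100
            for key in ("flagConfidence", "blockConfidence")
        )
    )
```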
API Configuration
Select the appropriate API endpoint for your region to ensure optimal performance:
Region | API Endpoint |
---|---|
Europe | https://api-eu.social.plus/ |
Singapore | https://api-sg.social.plus/ |
United States | https://api-us.social.plus/ |
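Clients can resolve the base URL from the table above before making any calls; only the base URL differs per region, so the lookup can be a simple mapping (the helper function itself is illustrative, not an SDK feature):

```python
# Resolve the regional social.plus base URL from the endpoint table.
ENDPOINTS = {
    "europe": "https://api-eu.social.plus/",
    "singapore": "https://api-sg.social.plus/",
    "united states": "https://api-us.social.plus/",
}

def base_url(region: str) -> str:
    """Return the API base URL for a region name, case-insensitively."""
    try:
        return ENDPOINTS[region.strip().lower()]
    except KeyError:
        raise ValueError(
            f"Unknown region {region!r}; expected one of {sorted(ENDPOINTS)}"
        )
```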
API Reference
For detailed administration workflows, see the Moderation Overview and analytics export documentation.
Best Practices
Configuration Strategy
- Start Conservative: Begin with moderate confidence levels and adjust based on results
- Monitor Performance: Track false positive and false negative rates
- Community-Specific: Tailor settings to your community’s content standards
- Regular Review: Periodically review and update thresholds as your community evolves
Human Oversight
- Review Queue Management: Ensure consistent review of flagged content
- Moderator Training: Train team on community standards and edge cases
- Appeal Process: Provide clear paths for users to contest moderation decisions
- Transparency: Communicate moderation policies clearly to users
Performance Optimization
- Batch Processing: Handle high-volume content efficiently
- Regional APIs: Use geographically appropriate endpoints
- Webhook Integration: Implement real-time event handling for flagged content
- Monitoring: Set up alerts for unusual moderation patterns
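The webhook integration point above can be sketched as a handler that routes incoming moderation events. The payload fields ("action", "contentId") are hypothetical, since the real event schema is defined by your webhook configuration:

```python
# Sketch of real-time handling for moderation webhook events.
# The payload shape ({"action": ..., "contentId": ...}) is an assumption
# for illustration, not the documented event schema.
def handle_event(event: dict, review_queue: list) -> str:
    """Route a moderation event: queue flagged content, acknowledge removals."""
    action = event.get("action")
    if action == "flagged":
        review_queue.append(event)   # hand off to the human review queue
        return "queued"
    if action == "blocked":
        return "removed"             # content already removed; log or notify
    return "ignored"                 # unrelated event types pass through
```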