Automatically locate and classify personally identifiable information (PII) within text content (Posts, Comments, Messages) and optionally redact it before display or export.
Why It Matters
Moderating user-generated content requires balancing safety, privacy, and transparency. PII Detection helps you:
- Reduce accidental exposure of sensitive user data.
- Enforce compliance (GDPR, SOC2 readiness, internal data handling policies).
- Streamline moderator workflows with structured PII metadata.
- Give client apps a consistent way to mask or highlight sensitive snippets.
Core Capabilities
Multi-Object Support
Works uniformly on Post, Comment, and Message text bodies.
Structured Metadata
Offset + length + category + confidence for each detection.
SDK Redaction API
Simple helpers to produce redacted strings per category set.
Quick Start
1. Enable Feature Flag
Contact Support to enable PII Detection for your network (Max plan+).
2. Create / Update Content
Users submit posts, comments, or chat messages as usual.
3. Fetch Objects
Retrieve objects through the SDK; each may contain zero or more PII entities.
4. Apply Redaction (Optional)
Call the redaction helper with category filters + replacement char.
5. Render / Export Safely
Display redacted text to end users; log or export masked forms.
Availability & Billing
PII Detection is available on Max plan and above. It is disabled by default; contact Support to have the feature flag enabled for your network.
Each processed text item (post, comment, or message) that runs through PII Detection counts toward your AI Moderation quota. Large backfills or bulk imports can consume quota quickly—coordinate with Support before running retrospective scans.
Staging / sandbox environments share quota policies; throttle batch operations or segment traffic if you are load‑testing PII detection.
High-Level Flow
- User submits text (post/comment/message create or update).
- Backend pipeline runs PII classification (synchronous or queued depending on volume).
- Detected entities are appended to the stored object as an embedded pii structure (or an array of them).
- Fetching the object via the SDK includes the PII metadata.
- Client optionally calls helper to generate a redacted representation for UI rendering, logs, or exports.
Data Model Extension
The core schema is augmented with an embedded PII block per detection. The actual shape may become array-based in future revisions.
If multiple PII entities are present, SDKs may expose them as a list. Plan for zero, one, or many.
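As a rough illustration of the "zero, one, or many" advice, the sketch below models one detection and normalizes the possible shapes into a list. The interface and function names (PiiEntity, ContentWithPii, piiEntities) are hypothetical; only the four fields (offset, length, category, confidence) come from the description above, and the real SDK models may differ.

```typescript
// Hypothetical shape of one PII detection. The field names follow the
// "offset + length + category + confidence" description, not a published schema.
interface PiiEntity {
  offset: number;     // start index of the detected span within the text
  length: number;     // number of characters covered by the span
  category: string;   // category string, e.g. "Email" (illustrative value)
  confidence: number; // detection score, assumed here to lie in 0..1
}

// Hypothetical content object: `pii` may be absent, a single block, or a list.
interface ContentWithPii {
  text: string;
  pii?: PiiEntity | PiiEntity[];
}

// Normalize the three possible shapes (none, one, many) into an array
// so that callers never need to branch on the stored shape.
function piiEntities(content: ContentWithPii): PiiEntity[] {
  if (content.pii === undefined) {
    return [];
  }
  return Array.isArray(content.pii) ? content.pii : [content.pii];
}
```

Normalizing once at the boundary keeps rendering code unchanged if the stored shape moves from a single block to an array.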
Typical Moderation Use Case
As an admin, I want to filter the moderation feed by AI-detected PII category so I can efficiently review content flagged under specific categories.
You can index pii.category in internal tools to provide facet filters, auto-redaction toggles, and export safeguards.
Redaction Strategy
Client apps decide how to display sensitive text:
- Full mask (default replacement char, e.g. *).
- Partial mask (apply only to specific categories, e.g. email + phone, keep names visible).
- Contextual highlight (do NOT replace, but visually annotate; implement using offsets).
Redaction is a presentation concern. Raw original content remains accessible to authorized roles unless you also purge/transform at the source.
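For clients that consume the metadata over the API and must mask text themselves, full and partial masking could be sketched as follows. This is not the SDK helper: the function name redact, the PiiEntity shape, and the category strings are assumptions, and the code assumes offsets count UTF-16 code units, which is how TypeScript strings are indexed.

```typescript
// Hypothetical entity shape; only the four field names come from the docs.
interface PiiEntity {
  offset: number;
  length: number;
  category: string;
  confidence: number;
}

// Mask every detected span whose category is selected.
// Pass `null` for `categories` to mask all categories (full mask),
// or a list such as ["Email", "PhoneNumber"] for a partial mask.
function redact(
  text: string,
  entities: PiiEntity[],
  categories: string[] | null,
  replacement: string = "*"
): string {
  // Work on an array of UTF-16 code units so overlapping spans stay safe:
  // positions never shift because each unit is overwritten in place.
  const units = text.split("");
  for (const entity of entities) {
    if (categories !== null && categories.indexOf(entity.category) === -1) {
      continue; // category not selected, leave the text visible
    }
    const start = Math.max(entity.offset, 0);
    const end = Math.min(entity.offset + entity.length, units.length);
    for (let i = start; i < end; i++) {
      units[i] = replacement;
    }
  }
  return units.join("");
}
```

Because the mask is computed from the original string on demand, the raw text is left untouched, which matches the idea that redaction is a presentation concern.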
SDK Usage
PII data is accessed through helper methods exposed on Post / Comment / Message models. Use redaction helpers to mask selected categories before rendering.
Supported today on Backend API, native Android & iOS SDKs. Web / React Native can consume metadata via API but must implement redaction manually. UIKit has no built‑in toggle yet (see UIKit Integration below).
Cache (original, redacted) pairs if you offer a moderator reveal toggle.
Handling Multiple Entities
When multiple detections exist, either:
- Apply the redaction helper (it handles ordering internally), or
- Sort entities by offset descending and splice manually if you need custom formatting (e.g. colored highlights).
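The manual option above can be sketched as follows: process the entities from the end of the string toward the start, so that replacing one span never shifts the offsets of spans that are still pending. The names annotate and wrap are hypothetical, the spans are assumed not to overlap, and offsets are assumed to be UTF-16 code units.

```typescript
// Hypothetical entity shape; only the four field names come from the docs.
interface PiiEntity {
  offset: number;
  length: number;
  category: string;
  confidence: number;
}

// Replace each detected span with a custom rendering of it.
// `wrap` receives the original snippet and its entity and returns the
// replacement text (for example a highlight marker or a masked form).
function annotate(
  text: string,
  entities: PiiEntity[],
  wrap: (snippet: string, entity: PiiEntity) => string
): string {
  // Copy before sorting so the caller's array is not reordered.
  const byOffsetDesc = entities.slice().sort((a, b) => b.offset - a.offset);
  let result = text;
  for (const entity of byOffsetDesc) {
    const start = entity.offset;
    const end = entity.offset + entity.length;
    // Everything before `start` is still the untouched original text,
    // so the offsets of the remaining (earlier) entities stay valid.
    result = result.slice(0, start) + wrap(result.slice(start, end), entity) + result.slice(end);
  }
  return result;
}
```

For example, a wrap function that returns "[" + category + ":" + snippet + "]" produces bracketed labels that a UI layer could turn into colored highlights.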
Categories (Full Current Set)
Below is the current set of PII categories emitted by the detection service:
SDK Representation (Android / Kotlin)
Below is the model structure (as provided) used within the Android SDK:
Categories like Person, PersonType, LicensePlate, SortCode, BankAccountNumber, DriversLicenseNumber, and generic IDs are surfaced via Category.OTHERS(value) when a dedicated constant is not defined. Handle them gracefully (e.g., map OTHERS("Person") to a user-friendly label and redaction policy).
Redaction Policy Tips
- Maintain a mapping table from raw category string → display label → redact (Y/N) → replacement char.
- Keep an allowlist of safe categories to show unmasked (e.g., DateTime) if policy permits.
- Log unexpected new category strings to monitoring so you can update UI mappings.
Not all detected categories must be redacted. Align category → action mappings with your internal compliance & privacy matrix.
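One way to hold the mapping table described in the tips is a small lookup with a conservative fallback. Everything here is illustrative: the type CategoryPolicy, the function policyFor, the labels, and the redact decisions are assumptions, and the category strings are examples rather than the full set.

```typescript
// One row of the mapping table: display label, redact (Y/N), replacement char.
interface CategoryPolicy {
  label: string;
  redact: boolean;
  replacement: string;
}

// Illustrative entries only; align the real table with your compliance matrix.
const POLICIES = new Map<string, CategoryPolicy>([
  ["Email", { label: "Email address", redact: true, replacement: "*" }],
  ["PhoneNumber", { label: "Phone number", redact: true, replacement: "*" }],
  ["Person", { label: "Person name", redact: false, replacement: "*" }],
  ["DateTime", { label: "Date or time", redact: false, replacement: "*" }],
]);

// Unknown categories are masked by default until someone reviews them.
const DEFAULT_POLICY: CategoryPolicy = {
  label: "Other sensitive data",
  redact: true,
  replacement: "*",
};

// Look up the policy for a raw category string. `onUnknown` is called for
// strings missing from the table so they can be reported to monitoring.
function policyFor(
  category: string,
  onUnknown: (category: string) => void = (c) => console.warn("Unmapped PII category: " + c)
): CategoryPolicy {
  const policy = POLICIES.get(category);
  if (policy !== undefined) {
    return policy;
  }
  onUnknown(category);
  return DEFAULT_POLICY;
}
```

Defaulting unknown categories to redact errs on the side of hiding data; whether that is the right default depends on your policy.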
Confidence Handling
- Treat very low confidence (e.g. < 0.50) as informational only.
- Provide a UI toggle to show/hide low-confidence detections.
- Consider server-side thresholding to reduce noise (e.g. only persist ≥0.60).
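A client-side version of the thresholding above might split detections into two buckets, so that a UI toggle can show or hide the low-confidence ones. The names bucketByConfidence and ConfidenceBuckets are hypothetical, and the 0.5 default simply mirrors the example value in the list.

```typescript
// Hypothetical entity shape; only the four field names come from the docs.
interface PiiEntity {
  offset: number;
  length: number;
  category: string;
  confidence: number;
}

interface ConfidenceBuckets {
  actionable: PiiEntity[];    // at or above the threshold: redact or flag
  informational: PiiEntity[]; // below the threshold: show only on request
}

// Split detections by confidence without mutating the input array.
function bucketByConfidence(entities: PiiEntity[], threshold: number = 0.5): ConfidenceBuckets {
  const buckets: ConfidenceBuckets = { actionable: [], informational: [] };
  for (const entity of entities) {
    if (entity.confidence >= threshold) {
      buckets.actionable.push(entity);
    } else {
      buckets.informational.push(entity);
    }
  }
  return buckets;
}
```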
Performance & Scaling
- Detection runs server-side; client-side redaction cost is O(n) in the number of entities.
- Multiple entities are safe; ensure your UI redaction logic accounts for shifting indices if you transform the string manually.
Privacy & Compliance Notes
- PII metadata is derivative; deleting the parent object removes the metadata.
- For export/audit pipelines, store only the redacted form unless there is an explicit legal basis for raw text access.
- Log access to unredacted content if operating in regulated environments.
FAQ
Q: Can I force server-side redaction so raw text never leaves?
A: Roadmap item; today redaction is client-driven using provided metadata.
Q: What if multiple overlapping detections occur?
A: Use the highest-confidence detection or merge the spans before rendering.
Q: Do offsets reflect UTF-8 or UTF-16?
A: Follow SDK string index semantics (documented per platform); when in doubt, validate with sample extraction.
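For the overlapping-detections question above, merging spans before rendering could look like the sketch below. The names mergeSpans and MaskSpan are hypothetical. The merged result drops category and confidence, so it suits masking rather than per-category highlighting, and spans that merely touch are merged as well.

```typescript
// Hypothetical entity shape; only the four field names come from the docs.
interface PiiEntity {
  offset: number;
  length: number;
  category: string;
  confidence: number;
}

// A merged region of text to mask, without category information.
interface MaskSpan {
  offset: number;
  length: number;
}

// Merge overlapping or touching detections into disjoint spans,
// returned in ascending offset order.
function mergeSpans(entities: PiiEntity[]): MaskSpan[] {
  const sorted = entities.slice().sort((a, b) => a.offset - b.offset);
  const merged: MaskSpan[] = [];
  for (const entity of sorted) {
    const last = merged.length > 0 ? merged[merged.length - 1] : undefined;
    if (last !== undefined && entity.offset <= last.offset + last.length) {
      // Extend the previous span to cover whichever end reaches further.
      const end = Math.max(last.offset + last.length, entity.offset + entity.length);
      last.length = end - last.offset;
    } else {
      merged.push({ offset: entity.offset, length: entity.length });
    }
  }
  return merged;
}
```

To check the offset semantics from the last answer, extract text.slice(offset, offset + length) for a few known samples and compare the result with the value you expect the detector to have found.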
UIKit Integration (Optional Customization)
Although UIKit does not yet ship a pre-wired PII redaction toggle, you can layer the redacted text helpers into the relevant content rendering components:
Android UIKit
- Post component: AmityPostContentElement
- Comment component: AmityCommentContentContainer
- Fetch the AmityPost / AmityComment as usual.
- Call the underlying SDK method (e.g. redactedText(...)).
- Inject the resulting string into your custom view binding inside the component.
iOS UIKit
- Post component: AmityPostContentComponent
- Comment component: AmityCommentView
- Retrieve the model (e.g. AmityPost).
- Generate the redacted variant via redactedText(...) (choose categories / replacement char).
- Assign to the text label / attributed string before layout.
Keep both original and redacted strings in memory if you offer a user role–based toggle (e.g., moderators can reveal originals). Avoid recomputing on every bind to reduce UI churn.
If you cache redacted content, ensure cache keys include the selected category set and replacement character to prevent stale or mismatched redactions.
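A cache key that follows this advice could be built as in the sketch below. The names redactionCacheKey and cachedRedaction are hypothetical, and contentId stands for whatever stable identifier your objects carry. The categories are de-duplicated and sorted so that the same selection always maps to the same key.

```typescript
// Build a key from the content id, the selected category set, and the
// replacement character. JSON encoding keeps the parts unambiguous even if
// a category name or the replacement contains a separator character.
function redactionCacheKey(contentId: string, categories: string[], replacement: string): string {
  const normalized = Array.from(new Set(categories)).sort();
  return JSON.stringify([contentId, normalized, replacement]);
}

// Minimal in-memory cache: compute the redacted text once per key.
const redactionCache = new Map<string, string>();

function cachedRedaction(
  contentId: string,
  categories: string[],
  replacement: string,
  compute: () => string
): string {
  const key = redactionCacheKey(contentId, categories, replacement);
  const hit = redactionCache.get(key);
  if (hit !== undefined) {
    return hit;
  }
  const value = compute();
  redactionCache.set(key, value);
  return value;
}
```

Changing either the category selection or the replacement character yields a different key, so a moderator switching from a partial to a full mask never sees a stale entry.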
Next: integrate PII category filters into your moderation console and add unit tests around redaction formatting.