AI Content Moderation
Creating safer communities for all
At Amity, we leverage AI for automatic content moderation to ensure that you have a safe online environment. We offer two types of AI moderation:
Pre-Moderation: Content is automatically reviewed before it is posted. Our AI system scans the content upon upload and generates a confidence value. If this confidence value is equal to or higher than the configured threshold, the content is blocked from being posted.
Post-Moderation: Post-moderation occurs after content has been posted. Our AI system considers two main factors, flagConfidence and blockConfidence. It scans the posted content and generates a confidence value. Based on this value and the configured thresholds, different actions are taken automatically.
To enable the AI Content Moderation feature, please contact our support team. Once the support team raises a task ticket, our development team will enable the feature for you.
Our AI pre-moderation feature is currently only available for image moderation. It ensures that all uploaded images are scanned for inappropriate, offensive, and undesirable content before they are published. Our AI system scans and detects undesirable content in images across 4 categories:
Nudity
Suggestive content
Violence
Disturbing
To enable image moderation, log in to the Amity Console. Under Moderation > Image Moderation, toggle "Enable image moderation" to "ON".
Once you've enabled image moderation, you will need to set the confidence level for each moderation category. In the context of content moderation, confidence levels represent the degree of certainty the AI system has in identifying specific content categories within an uploaded image. It is crucial to set confidence levels for each moderation category to fine-tune the system's sensitivity in a way that meets your moderation needs.
By default, confidence levels are set to "0" for each category. A confidence level of "0" implies a low threshold, making the system more likely to block images, potentially resulting in false positives, even if the content is not inappropriate. For more accurate and reliable results, it is recommended to set confidence levels at a higher threshold. A higher confidence value indicates a stronger certainty in the content classification.
When enabled, our AI system will scan images uploaded in posts, comments, and messages and return a confidence value. If this confidence value is equal to or higher than the configured threshold, the content will be blocked from being posted. The undesirable image will need to be removed by the user in order for their post, comment, or message to be posted.
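The blocking rule above can be sketched as a simple per-category threshold check. The function below is illustrative only and not part of Amity's SDK; the category names and score scale (0-100) follow the description above.

```python
# Sketch of the pre-moderation decision: an image is blocked when the AI
# confidence for any category meets or exceeds that category's configured
# threshold. Illustrative code, not an Amity SDK API.

def should_block_image(scan_scores: dict, thresholds: dict) -> bool:
    """scan_scores: AI confidence per category (0-100).
    thresholds: console-configured confidence level per category."""
    return any(score >= thresholds.get(category, 0)
               for category, score in scan_scores.items())

# With the default threshold of 0, any detection blocks the image:
print(should_block_image({"Nudity": 12.5}, {"Nudity": 0}))     # True
# A higher threshold lets borderline content through:
print(should_block_image({"Violence": 55}, {"Violence": 80}))  # False
```

This also illustrates why the default threshold of 0 is prone to false positives: any nonzero score meets a threshold of 0.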
Our AI post-moderation feature goes beyond the basics, offering an enhanced moderation experience. It scans a wider range of content types across more moderation categories, and provides more flexibility in configuring actions to be taken based on flag and block confidence levels.
Our AI post-moderation feature supports moderating text, image, and video content within messages and posts, and text and image in comments.
We will be supporting AI livestream post-moderation soon!
Our AI text post-moderation feature detects and moderates text on posts, comments, and messages that are:
Sexually explicit or adult in certain situations
Sexually suggestive or mature in certain situations
Offensive in certain situations
Our AI image and video post-moderation feature detects images and videos on posts and messages, and images in comments that contain imagery depicting:
Adult Toys
Air crash
Alcohol
Alcoholic Beverages
Bare-chested Male
Corpses
Drinking
Drug Paraphernalia
Drug Products
Drug Use
Drugs
Emaciated Bodies
Explicit Nudity
Explosions and blasts
Extremist
Female Swimwear Or Underwear
Gambling
Graphic Female Nudity
Graphic Male Nudity
Graphic Violence Or Gore
Hanging
Hate Symbols
Illustrated Explicit Nudity
Male Swimwear Or Underwear
Middle Finger
Nazi Party
Nudity
Partial Nudity
Physical Violence
Pills
Revealing Clothes
Rude Gestures
Self Injury
Sexual Activity
Sexual Situations
Smoking
Suggestive
Tobacco
Tobacco Products
Violence
Visually Disturbing
Weapon Violence
Weapons
White Supremacy
Moderation is performed using two main factors: flagConfidence and blockConfidence. When a post, comment, or message is successfully created, our AI system automatically scans both the text and media, generating a confidence value. If this confidence value falls below flagConfidence, the post, comment, or message passes moderation. If the confidence value falls between flagConfidence and blockConfidence, our moderation feature will flag the content for review.
You can review the flagged content in Amity Console, or listen for Amity's real-time flagged post, flagged comment, or flagged message events through your webhook on the server side, allowing you to implement appropriate actions based on your moderation policies. Finally, if the confidence value surpasses blockConfidence, the system deletes the content altogether, ensuring a responsive approach to maintaining a safe online environment.
As a default configuration, all categories have an initial flagConfidence value of 40 and an initial blockConfidence value of 80.
To use the enhanced moderation APIs, specify the appropriate API endpoint on each HTTP request. Each data center has a unique endpoint, so it's essential to adjust it accordingly. By selecting the correct endpoint associated with the right location, you ensure faster response times and optimize the overall performance of your API requests.
The Get moderation confidence level API can be used to retrieve the confidence level for each moderation category. The API returns a list of moderation categories along with their corresponding confidence levels.
The Set moderation confidence level API can be used to set the confidence level for any moderation category. The API returns the list of moderation categories along with their corresponding confidence levels.
We recommend adjusting the confidence levels for each category based on your moderation policies and the needs of your community. By doing so, you can ensure that Enhanced Moderation provides the optimal level of moderation for your platform.
| Name | Data Type | Description |
|---|---|---|
| category | String | Name of each moderation category |
| flagConfidence | Number | Value of the moderation category's flag confidence |
| blockConfidence | Number | Value of the moderation category's block confidence |
| moderationType | String | Value of the moderation type for the category. There are 2 possible values: "text" or "media" |

| Region | API Endpoint |
|---|---|
| Europe | |
| Singapore | |
| United States | |
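As a sketch of how these calls fit together, the snippet below builds a Set moderation confidence level request with Python's standard library. The host, URL path, request shape, and auth header are placeholders, not Amity's documented values; substitute the endpoint for your data center and the actual paths and payload from Amity's API reference.

```python
# Sketch of building a set-confidence-level request. BASE_URL, the path,
# and the payload shape are hypothetical placeholders -- consult Amity's
# API reference for the real endpoint and schema for your data center.
import json
from urllib import request

BASE_URL = "https://api.example-region.amity.co"  # placeholder host

def build_set_payload(category: str, flag_confidence: float,
                      block_confidence: float) -> dict:
    """Request body fields mirror the table above (assumed shape)."""
    return {
        "category": category,
        "flagConfidence": flag_confidence,
        "blockConfidence": block_confidence,
    }

def set_confidence_request(api_key: str, category: str,
                           flag_confidence: float,
                           block_confidence: float) -> request.Request:
    body = json.dumps(build_set_payload(
        category, flag_confidence, block_confidence)).encode()
    return request.Request(
        BASE_URL + "/moderation/confidence",  # hypothetical path
        data=body,
        headers={"Authorization": "Bearer " + api_key,
                 "Content-Type": "application/json"},
        method="PUT",
    )

req = set_confidence_request("YOUR_API_KEY", "Nudity", 40, 80)
# request.urlopen(req) would send it; omitted here.
```

Pointing BASE_URL at the endpoint for your region keeps requests close to your data center, which is the latency consideration noted above.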